FOR FURTHER ACTION: See Notification of Transmittal of International Preliminary Examination Report (Form PCT/IPEA/416)

International application No.
PCT/CA00/00725

International filing date (day/month/year)
16/06/2000

1. This international preliminary examination report has been prepared by this International Preliminary Examining Authority
and is transmitted to the applicant according to Article 36.

2. This REPORT consists of a total of 7 sheets, including this cover sheet. 

□ This report is also accompanied by ANNEXES, i.e. sheets of the description, claims and/or drawings which have
been amended and are the basis for this report and/or sheets containing rectifications made before this Authority
(see Rule 70.16 and Section 607 of the Administrative Instructions under the PCT).

These annexes consist of a total of sheets. 



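3. This report contains indications relating to the following items:

I     ☒  Basis of the report
II    □  Priority
III   ☒  Non-establishment of opinion with regard to novelty, inventive step and industrial applicability
IV    □  Lack of unity of invention
V     ☒  Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability;
         citations and explanations supporting such statement
VI    □  Certain documents cited
VII   □  Certain defects in the international application
VIII  ☒  Certain observations on the international application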
Tel. +49 89 2399-0 Tx: 523656 epmu d
INTERNATIONAL PRELIMINARY
EXAMINATION REPORT

International application No. PCT/CA00/00725



I. Basis of the report

1. With regard to the elements of the international application (Replacement sheets which have been furnished to
the receiving Office in response to an invitation under Article 14 are referred to in this report as "originally filed"
and are not annexed to this report since they do not contain amendments (Rules 70.16 and 70.17)):

Description, pages:
1-245 as originally filed

Claims, No.:
1-35 as originally filed

Drawings, sheets:
1/53-53/53 as originally filed

Sequence listing part of the description, pages:
1, as originally filed

2. With regard to the language, all the elements marked above were available or furnished to this Authority in the
language in which the international application was filed, unless otherwise indicated under this item.

These elements were available or furnished to this Authority in the following language: , which is:

□ the language of a translation furnished for the purposes of the international search (under Rule 23.1(b)).

□ the language of publication of the international application (under Rule 48.3(b)).

□ the language of a translation furnished for the purposes of international preliminary examination (under Rule
55.2 and/or 55.3).

3. With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the
international preliminary examination was carried out on the basis of the sequence listing:

☒ contained in the international application in written form.

□ filed together with the international application in computer readable form.

□ furnished subsequently to this Authority in written form.

☒ furnished subsequently to this Authority in computer readable form.

☒ The statement that the subsequently furnished written sequence listing does not go beyond the disclosure in
the international application as filed has been furnished.

☒ The statement that the information recorded in computer readable form is identical to the written sequence
listing has been furnished.

4. The amendments have resulted in the cancellation of:

□ the description, pages:
□ the claims, Nos.:
□ the drawings, sheets:

5. □ This report has been established as if (some of) the amendments had not been made, since they have been
considered to go beyond the disclosure as filed (Rule 70.2(c)):

(Any replacement sheet containing such amendments must be referred to under item 1 and annexed to this
report.)

6. Additional observations, if necessary:

III. Non-establishment of opinion with regard to novelty, inventive step and industrial applicability

1. The questions whether the claimed invention appears to be novel, to involve an inventive step (to be
non-obvious), or to be industrially applicable have not been examined in respect of:

□ the entire international application.

☒ claims Nos. 20, 27, 29-32 all partially, 34, 35 completely.

because:

☒ the said international application, or the said claims Nos. 34 and 35 relate to the following subject matter
which does not require an international preliminary examination (specify):
see separate sheet

□ the description, claims or drawings (indicate particular elements below) or said claims Nos. are so unclear
that no meaningful opinion could be formed (specify):

□ the claims, or said claims Nos. are so inadequately supported by the description that no meaningful opinion
could be formed.

☒ no international search report has been established for the said claims Nos. 20, 27, 29-32 all partially.

2. A meaningful international preliminary examination cannot be carried out due to the failure of the nucleotide
and/or amino acid sequence listing to comply with the standard provided for in Annex C of the Administrative
Instructions:

□ the written form has not been furnished or does not comply with the standard.

□ the computer readable form has not been furnished or does not comply with the standard.

V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability;
citations and explanations supporting such statement
1. Statement

Novelty (N)
Yes: Claims
No: Claims 1-7, 11, 13-16, 18-20, 27, 29

Inventive step (IS)
Yes: Claims
No: Claims 1-33

Industrial applicability (IA)
Yes: Claims 1-30
No: Claims

2. Citations and explanations
see separate sheet

VIII. Certain observations on the international application

The following observations on the clarity of the claims, description, and drawings or on the question whether the
claims are fully supported by the description, are made:

see separate sheet
For reasons of better comprehension, the present opinion does not stick to the usual
division into points I.-VIII.

Claims 1-10 and 14-19 are directed to two- or three-dimensional structures. The claims
are therefore not directed to the product per se but only to the structural information of
the product. These claims are neither product claims nor method claims but appear to
be mere presentation of information. The subject-matter of these claims is therefore, in
principle, excluded from examination under Article 34(4)(a)(i) in combination with Rule
67.1(v) PCT.

Nevertheless, the following comments on novelty, inventive step, and clarity have been
made:

Reference is made to the following documents: 

D1: CHARNOCK SIMON J ET AL: "Structure of the nucleotide-diphospho-sugar
    transferase, SpsA from Bacillus subtilis, in native and nucleotide-complexed
    forms." BIOCHEMISTRY, vol. 38, no. 20, 18 May 1999 (1999-05-18), pages
    6380-6385

D2: NISHIKAWA Y ET AL: "CONTROL OF GLYCOPROTEIN SYNTHESIS
    PURIFICATION AND CHARACTERIZATION OF RABBIT LIVER
    UDP-N-ACETYLGLUCOSAMINE ALPHA-3-D-MANNOSIDE
    BETA-1,2-N-ACETYLGLUCOSAMINYLTRANSFERASE I" JOURNAL OF BIOLOGICAL
    CHEMISTRY, vol. 263, no. 17, 1988, pages 8270-8281

D3: ULLMAN C AND PERKINS S: "A classification of nucleotide-diphospho-sugar
    glycosyltransferases based on amino acid sequence similarities" BIOCHEMICAL
    JOURNAL, vol. 326, 1997, pages 929-941

D4: KUNTZ I D ET AL: "STRUCTURE-BASED MOLECULAR DESIGN" ACCOUNTS
    OF CHEMICAL RESEARCH, US, AMERICAN CHEMICAL SOCIETY,
    WASHINGTON, vol. 27, no. 5, May 1994 (1994-05), pages 117-123

D5: GSCHWEND D A ET AL: "MOLECULAR DOCKING TOWARDS DRUG
    DISCOVERY" JOURNAL OF MOLECULAR RECOGNITION, GB, HEYDEN &
    SON LTD., LONDON, vol. 9, 1996, pages 175-186

1. Claims 20, 27, 29-32 have only been searched partially. Claims or parts of
claims that have only been searched partially need not be the subject of an
international preliminary examination (Rule 66.1(e) PCT). Thus, said claims are
only examined with respect to UDP-GlcNAc.

Claims 34 and 35 concern the mere presentation of information and are thus
excluded from examination according to Rule 67.1(v) PCT.
2. Clarity and Novelty

Claims 1-7, 11, 13-16, and 18-33 do not fulfil the requirements of Article 6 PCT.
The reasons for this lack of clarity are as follows:

Product claims 1-7, 11, 13-16, 18 and 19 directed to structures of
glycosyltransferases (GT) do not define said structures at all or in a way that
permits the man skilled in the art to clearly comprehend the subject-matter
claimed. Said claims are so broad that they also are not novel over D1-D3, which
describe glycosyltransferases in solution or in crystal form as well as bound to
substrates (non-fulfilment of Article 33(2) PCT).

All claims directed to modulators or inhibitors of GT, or to methods for obtaining such
modulators or inhibitors, namely claims 20 and 22-33, are not supported by the
description since, apart from the natural substrate UDP-GlcNAc, neither such
compounds are disclosed nor are methods indicated that allow for the obtention
of such compounds (i.e. the applicants actually do not provide any experimental
data; if, on the other hand, the general and vague description of the claimed
methods is sufficient for the obtention of such a compound, then said methods
and compounds would certainly not be inventive). Of course, this objection
also holds for the medical use of such hypothetical modulators, as defined in
claims 31-33.

Claim 21 in its broadness lacks a recognisable technical problem and is therefore
unclear.

Product claims have to be novel as such. D2 describes UDP-GlcNAc as a ligand
of GnT-I. Thus, claims 20, 27, and 29 lack novelty (Article 33(2) PCT).



3. Inventive step of claims 8-10, 12, 17, 21-26, 28, and 30-33 

Claims 8-10, 12 and 17 define structures of GT by atomic coordinates. However,
it is not clear what technical effect could be associated with these claims. Thus,
the underlying technical problem is the elucidation of the crystal structure of a
GT. This is a scientific project, and not an invention characterised by a technical
effect. Since there is no technical problem defined, said claims cannot be
inventive.

Moreover, D2 describes the purification of the identical enzyme of the present
application. D1 discloses the crystal structure of the GT, SpsA, from B. subtilis.
D3 provides extensive analytical data on sequences and structure comparisons
by sequence alignments of over 500 GT. The obtention of crystals and X-ray
structures from a member of these GT families is a trivial problem.

As stated under point 2., claim 21 lacks a technical problem, and therefore an
inventive step.

Claims 22-26 do not contain any special technical feature (see also point 2.).
Molecular modelling of protein ligands (e.g. inhibitors, substrates) and
structure-based molecular design are well known in the prior art (e.g., D4 and D5).
Thus, said claims are trivial.

The features of the present claims 28 and 30-33 are either trivial or conventional
in the art or within the competence of a skilled man seeking to improve the prior
art processes mentioned in the search report and in the present opinion.
Thus, all claims mentioned under this point are not inventive.



4. For the assessment of the present claims 31-33 on the question whether they
are industrially applicable, no unified criteria exist in the PCT Contracting States.
The patentability can also be dependent upon the formulation of the claims. The
EPO, for example, does not recognize as industrially applicable the subject-matter
of claims to the use of a compound in medical treatment, but may allow,
however, claims to a known compound for first use in medical treatment and the
use of such a compound for the manufacture of a medicament for a new medical
treatment.
Applicant's or agent's file reference
P174-PCT11

FOR FURTHER ACTION: see Notification of Transmittal of International Search Report
(Form PCT/ISA/220) as well as, where applicable, item 5 below.

International application No.
PCT/CA 00/00725

International filing date (day/month/year)
16/06/2000

(Earliest) Priority Date (day/month/year)
18/06/1999

Applicant
RINI, James







This International Search Report has been prepared by this International Searching Authority and is transmitted to the applicant
according to Article 18. A copy is being transmitted to the International Bureau.

This International Search Report consists of a total of 5 sheets.

☒ It is also accompanied by a copy of each prior art document cited in this report.

1. Basis of the report

a. With regard to the language, the international search was carried out on the basis of the international application in the
language in which it was filed, unless otherwise indicated under this item.

□ the international search was carried out on the basis of a translation of the international application furnished to this
Authority (Rule 23.1(b)).

b. With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the international search
was carried out on the basis of the sequence listing:

☒ contained in the international application in written form.
□ filed together with the international application in computer readable form,
furnished subsequently to this Authority in written form,
furnished subsequently to this Authority in computer readable form.



□ 



the statement that the subsequently furnished written sequence listing does not go beyond the disclosure in the
international application as filed has been furnished.

the statement that the information recorded in computer readable form is identical to the written sequence listing has been
furnished.

2. ☒ Certain claims were found unsearchable (see Box I).
3. □ Unity of invention is lacking (see Box II).

4. With regard to the title,

☒ the text is approved as submitted by the applicant.

□ the text has been established by this Authority to read as follows:

5. With regard to the abstract,

☒ the text is approved as submitted by the applicant.

□ the text has been established, according to Rule 38.2(b), by this Authority as it appears in Box III. The applicant may,
within one month from the date of mailing of this international search report, submit comments to this Authority.

6. The figure of the drawings to be published with the abstract is Figure No.

□ as suggested by the applicant. ☒ None of the figures.

□ because the applicant failed to suggest a figure.

□ because this figure better characterizes the invention.
FURTHER INFORMATION CONTINUED FROM PCT/ISA/210

Continuation of Box I.1

Although claim 31 is directed to a method of treatment of the
human/animal body, the search has been carried out and based on the
alleged effects of the compound/composition.

Although claims 34 and 35 could be considered as a mere presentation of
information, Rule 39.1(v) and (vi) PCT, the search has been carried out as
far as possible in our systematic documentation.

Continuation of Box I.2

Present claims 20, 27, 29-32 relate to compounds defined by reference to a
desirable characteristic or property, namely modulating
glycosyltransferases.

The claims cover all compounds having this characteristic or property, 
whereas the application provides support within the meaning of Article 6 
PCT and/or disclosure within the meaning of Article 5 PCT for only 
UDP-GlcNAc. In the present case, the claims so lack support, and the 
application so lacks disclosure, that a meaningful search over the whole 
of the claimed scope is impossible. Independent of the above reasoning, 
the claims also lack clarity (Article 6 PCT). An attempt is made to
define the compound by reference to a result to be achieved. Again, this 
lack of clarity in the present case is such as to render a meaningful 
search over the whole of the claimed scope impossible. Consequently, the 
search has been carried out for those parts of the claims which appear to 
be clear, supported and disclosed, namely those parts relating to the 
UDP-GlcNAc. 

The applicant's attention is drawn to the fact that claims, or parts of
claims, relating to inventions in respect of which no international 
search report has been established need not be the subject of an 
international preliminary examination (Rule 66.1(e) PCT). The applicant 
is advised that the EPO policy when acting as an International 
Preliminary Examining Authority is normally not to carry out a 
preliminary examination on matter which has not been searched. This is 
the case irrespective of whether or not the claims are amended following 
receipt of the search report or during any Chapter II procedure. 
A. CLASSIFICATION OF SUBJECT MATTER

IPC 7 C12N9/10 C12Q1/48 A61K31/70

According to International Patent Classification (IPC) or to both national classification and IPC



B. FIELDS SEARCHED 



Minimum documentation searched (classification system followed by classification symbols)

IPC 7 C12N 



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 



Electronic data base consulted during the international search (name of data base and, where practical, search terms used)

BIOSIS, EPO-Internal 



C. DOCUMENTS CONSIDERED TO BE RELEVANT 



Category *   Citation of document, with indication, where appropriate, of the relevant passages   Relevant to claim No.

CHARNOCK SIMON J ET AL: "Structure of the
nucleotide-diphospho-sugar transferase,
SpsA from Bacillus subtilis, in native and
nucleotide-complexed forms."
BIOCHEMISTRY,
vol. 38, no. 20, 18 May 1999 (1999-05-18),
pages 6380-6385, XP002152151
ISSN: 0006-2960
the whole document
Relevant to claim No.: 1-3, 14, 15, 20, 21, 27, 29, 34, 35
Relevant to claim No.: 18, 19, 22-26, 28, 30-32

Further documents are listed in the continuation of box C.

□ Patent family members are listed in annex.



* Special categories of cited documents:

"A" document defining the general state of the art which is not
considered to be of particular relevance
"E" earlier document but published on or after the international
filing date
"L" document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or
other means
"P" document published prior to the international filing date but
later than the priority date claimed
"T" later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention
"X" document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone
"Y" document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art.
"&" document member of the same patent family



Date of the actual completion of the international search 



8 November 2000 



Date of mailing of the international search report

22/11/2000

Name and mailing address of the ISA

European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016

Authorized officer

VAN DER SCHAAL, C
Category *   Citation of document, with indication, where appropriate, of the relevant passages

NISHIKAWA Y ET AL: "CONTROL OF
GLYCOPROTEIN SYNTHESIS PURIFICATION AND
CHARACTERIZATION OF RABBIT LIVER
UDP-N-ACETYLGLUCOSAMINE
ALPHA-3-D-MANNOSIDE
BETA-1,2-N-ACETYLGLUCOSAMINYLTRANSFERASE I"
JOURNAL OF BIOLOGICAL CHEMISTRY,
vol. 263, no. 17, 1988, pages 8270-8281,
XP002152152
ISSN: 0021-9258
the whole document

ULLMAN C AND PERKINS S: "A classification
of nucleotide-diphospho-sugar
glycosyltransferases based on amino acid
sequence similarities"
BIOCHEMICAL JOURNAL,
vol. 326, 1997, pages 929-941, XP002152153
page 937, paragraph 1

DATABASE BIOSIS [Online]
BIOSCIENCES INFORMATION SERVICE,
PHILADELPHIA, PA, US; 1994
SARKAR MOHAN: "Expression of recombinant
rabbit UDP-GlcNAc:alpha-3-D-mannoside
beta-1,2-N-acetylglucosaminyltransferase I
catalytic domain in Sf9 insect cells."
Database accession no. PREV199598004969
XP002152154
abstract
& GLYCOCONJUGATE JOURNAL,
vol. 11, no. 3, 1994, pages 204-209,
ISSN: 0282-0080

KUNTZ I D ET AL: "STRUCTURE-BASED
MOLECULAR DESIGN"
ACCOUNTS OF CHEMICAL RESEARCH, US, AMERICAN
CHEMICAL SOCIETY, WASHINGTON,
vol. 27, no. 5, May 1994 (1994-05), pages
117-123, XP000885741
ISSN: 0001-4842
cited in the application
the whole document

GSCHWEND D A ET AL: "MOLECULAR DOCKING
TOWARDS DRUG DISCOVERY"
JOURNAL OF MOLECULAR RECOGNITION, GB, HEYDEN
& SON LTD., LONDON,
vol. 9, 1996, pages 175-186, XP000882526
ISSN: 0952-3499
the whole document
1. ☒ Claims Nos.:

because they relate to subject matter not required to be searched by this Authority, namely:

see FURTHER INFORMATION sheet PCT/ISA/210

2. ☒ Claims Nos.:

because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:

see FURTHER INFORMATION sheet PCT/ISA/210

3. Claims Nos.:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

1. □ As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims.

2. □ As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

3. □ As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:

4. No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

Remark on Protest
□ The additional search fees were accompanied by the applicant's protest.
□ No protest accompanied the payment of additional search fees.
Non-establishment of opinion with regard to novelty, inventive step and industrial applicability

Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability;
citations and explanations supporting such statement

Certain documents cited 

Certain defects in the international application 

Certain observations on the international application 
18/01/2001

Form PCT/IPEA/409 (cover sheet) (January 1994)



INTERNATIONAL PRELIMINARY 
EXAMINATION REPORT 



International application No. PCT/CA00/00725



I. Basis of the report 

1. With regard to the elements of the international application (Replacement sheets which have been furnished to
the receiving Office in response to an invitation under Article 14 are referred to in this report as "originally filed"
and are not annexed to this report since they do not contain amendments (Rules 70.16 and 70.17)):

Description, pages:

1-245 as originally filed

Claims, No.:

1-35 as originally filed

Drawings, sheets:

1/53-53/53 as originally filed

Sequence listing part of the description, pages:

1, as originally filed

2. With regard to the language, all the elements marked above were available or furnished to this Authority in the 
language in which the international application was filed, unless otherwise indicated under this item. 

These elements were available or furnished to this Authority in the following language: , which is: 

□ the language of a translation furnished for the purposes of the international search (under Rule 23.1 (b)). 

□ the language of publication of the international application (under Rule 48.3(b)).

□ the language of a translation furnished for the purposes of international preliminary examination (under Rule
55.2 and/or 55.3).

3. With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the
international preliminary examination was carried out on the basis of the sequence listing:

☒ contained in the international application in written form.

□ filed together with the international application in computer readable form.

□ furnished subsequently to this Authority in written form.

☒ furnished subsequently to this Authority in computer readable form.

☒ The statement that the subsequently furnished written sequence listing does not go beyond the disclosure in
the international application as filed has been furnished.

☒ The statement that the information recorded in computer readable form is identical to the written sequence
listing has been furnished.

4. The amendments have resulted in the cancellation of: 
□ the description, pages:
□ the claims, Nos.:
□ the drawings, sheets:



5. □ This report has been established as if (some of) the amendments had not been made, since they have been
considered to go beyond the disclosure as filed (Rule 70.2(c)):

(Any replacement sheet containing such amendments must be referred to under item 1 and annexed to this
report.)

6. Additional observations, if necessary: 

III. Non-establishment of opinion with regard to novelty, inventive step and industrial applicability 

1. The questions whether the claimed invention appears to be novel, to involve an inventive step (to be
non-obvious), or to be industrially applicable have not been examined in respect of:

□ the entire international application.

☒ claims Nos. 20, 27, 29-32 all partially, 34, 35 completely.

☒ the said international application, or the said claims Nos. 34 and 35 relate to the following subject matter
which does not require an international preliminary examination (specify):
see separate sheet

□ the description, claims or drawings (indicate particular elements below) or said claims Nos. are so unclear
that no meaningful opinion could be formed (specify):

□ the claims, or said claims Nos. are so inadequately supported by the description that no meaningful opinion
could be formed.

☒ no international search report has been established for the said claims Nos. 20, 27, 29-32 all partially.

2. A meaningful international preliminary examination cannot be carried out due to the failure of the nucleotide
and/or amino acid sequence listing to comply with the standard provided for in Annex C of the Administrative
Instructions:

□ the written form has not been furnished or does not comply with the standard.

□ the computer readable form has not been furnished or does not comply with the standard.



V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability; 
citations and explanations supporting such statement 



because: 
1. Statement

Novelty (N)                     Yes: Claims
                                No:  Claims 1-7, 11, 13-16, 18-20, 27, 29

Inventive step (IS)             Yes: Claims
                                No:  Claims 1-33

Industrial applicability (IA)   Yes: Claims 1-30
                                No:  Claims



2. Citations and explanations 
see separate sheet 



VIII. Certain observations on the international application 

The following observations on the clarity of the claims, description, and drawings or on the question whether the 
claims are fully supported by the description, are made: 
see separate sheet 
For reasons of better comprehension, the present opinion does not stick to the usual
division into points I-VIII.

Claims 1-10 and 14-19 are directed to two- or three-dimensional structures. The claims
are therefore not directed to the product per se but only to the structural information of
the product. These claims are neither product claims nor method claims but appear to
be mere presentation of information. The subject-matter of these claims is therefore, in
principle, excluded from examination under Article 34(4)(a)(i) in combination with Rule
67.1(v) PCT.

Nevertheless, the following comments on novelty, inventive step, and clarity have been 
made: 

Reference is made to the following documents: 

D1: CHARNOCK SIMON J ET AL: 'Structure of the nucleotide-diphospho-sugar
transferase, SpsA from Bacillus subtilis, in native and nucleotide-complexed
forms.' BIOCHEMISTRY, vol. 38, no. 20, 18 May 1999 (1999-05-18), pages
6380-6385

D2: NISHIKAWA Y ET AL: 'CONTROL OF GLYCOPROTEIN SYNTHESIS
PURIFICATION AND CHARACTERIZATION OF RABBIT LIVER
UDP-N-ACETYLGLUCOSAMINE ALPHA-3-D-MANNOSIDE
BETA-1,2-N-ACETYLGLUCOSAMINYLTRANSFERASE I' JOURNAL OF BIOLOGICAL
CHEMISTRY, vol. 263, no. 17, 1988, pages 8270-8281

D3: ULLMAN C AND PERKINS S: 'A classification of nucleotide-diphospho-sugar
glycosyltransferases based on amino acid sequence similarities' BIOCHEMICAL
JOURNAL, vol. 326, 1997, pages 929-941

D4: KUNTZ I D ET AL: "STRUCTURE-BASED MOLECULAR DESIGN" ACCOUNTS
OF CHEMICAL RESEARCH, US, AMERICAN CHEMICAL SOCIETY,
WASHINGTON, vol. 27, no. 5, May 1994 (1994-05), pages 117-123

D5: GSCHWEND D A ET AL: "MOLECULAR DOCKING TOWARDS DRUG
DISCOVERY" JOURNAL OF MOLECULAR RECOGNITION, GB, HEYDEN &
SON LTD., LONDON, vol. 9, 1996, pages 175-186

1. Claims 20, 27, 29-32 have only been searched partially. Claims or parts of 
claims that have only been searched partially need not be the subject of an 
international preliminary examination, Rule 66.1(e) PCT. Thus, said claims are 
only examined with respect to UDP-GlcNAc. 

Claims 34 and 35 concern the mere presentation of information and are thus
excluded from examination according to Rule 67.1(v) PCT.

2. Clarity and Novelty 

Claims 1-7, 11, 13-16, and 18-33 do not fulfil the requirements of Article 6 PCT.
The reasons for this lack of clarity are as follows:
Product claims 1-7, 11, 13-16, 18 and 19 directed to structures of
glycosyltransferases (GT) do not define said structures at all or in a way that
permits the man skilled in the art to clearly comprehend the subject-matter
claimed. Said claims are so broad that they also are not novel over D1-D3, which
describe glycosyltransferases in solution or in crystal form as well as bound to
substrates (non-fulfilment of Article 33(2) PCT).

All claims directed to modulators, inhibitors of GT or methods for obtaining such
modulators or inhibitors, namely claims 20, 22-33 are not supported by the
description since, apart from the natural substrate UDP-GlcNAc, neither such
compounds are disclosed nor are methods indicated that allow for the obtention
of such compounds (i.e. the applicants actually do not provide any experimental
data). If, on the other hand, the general and vague description of the claimed
methods is sufficient for the obtention of such a compound, then said methods
and compounds would certainly not be inventive. Of course, this objection
also holds for the medical use of such hypothetical modulators, as defined in
claims 31-33.

Claim 21 in its broadness lacks a recognisable technical problem and is therefore
unclear.

Product claims have to be novel as such. D2 describes UDP-GlcNAc as a ligand
of GnT-I. Thus, claims 20, 27, and 29 lack novelty (Article 33(2) PCT).

3. Inventive step of claims 8-10, 12, 17, 21-26, 28, and 30-33 

Claims 8-10, 12 and 17 define structures of GT by atomic coordinates. However,
it is not clear what technical effect could be associated with these claims. Thus,
the underlying technical problem is the elucidation of the crystal structure of a
GT. This is a scientific project, and not an invention characterised by a technical
effect. Since there is no technical problem defined, said claims cannot be
inventive.

Moreover, D2 describes the purification of the identical enzyme of the present
application. D1 discloses the crystal structure of the GT, SpsA, from B. subtilis.
D3 provides extensive analytical data on sequences and structure comparisons
by sequence alignments of over 500 GT. The obtention of crystals and X-ray
structures from a member of these GT families is a trivial problem.

As stated under point 2., claim 21 lacks a technical problem, and therefore an
inventive step.

Claims 22-26 do not contain any special technical feature (see also point 2.). 
Molecular modelling of protein ligands (e.g. inhibitors, substrates) and structure- 
based molecular design are well known in the prior art (e.g., D4 and D5). Thus, 
said claims are trivial. 

The features of the present claims 28 and 30-33 are either trivial or conventional 
in the art or within the competence of a skilled man seeking to improve the prior 
art processes mentioned in the search report and in the present opinion. 
Thus, all claims mentioned under this point are not inventive. 

4. For the assessment of the present claims 31-33 on the question whether they
are industrially applicable, no unified criteria exist in the PCT Contracting States.
The patentability can also be dependent upon the formulation of the claims. The
EPO, for example, does not recognize as industrially applicable the
subject-matter of claims to the use of a compound in medical treatment, but may allow,
however, claims to a known compound for first use in medical treatment and the
use of such a compound for the manufacture of a medicament for a new medical
treatment.
Abstract: The invention relates to the three dimensional structure of a glycosyltransferase. The atomic coordinates that define
the structure and any compounds bound to the structure enable the determination of the three dimensional structures of
glycosyltransferases with unknown structure, and the identification of modulators of a glycosyltransferase.

TITLE: Glycosyltransferases Structures 
FIELD OF THE INVENTION 

The invention relates to the secondary and three dimensional structures of glycosyltransferases. The
atomic coordinates that define the structure and any compounds bound to the structure may be used to
determine glycosyltransferase homologues and the structures of polypeptides with unknown structure, and to
identify modulators of glycosyltransferases.
BACKGROUND OF THE INVENTION

The oligosaccharide chains of N- and O-linked glycoproteins play a crucial role in a number of
biological processes. Their biosynthesis and degradation pathways are therefore areas of significant interest
for biology, medicine, and biotechnology. The assembly of the various types of oligosaccharides involves
several glycosidases and glycosyltransferases. In comparison with glycosidases, the mechanisms of which
have been characterized in some detail, mechanistic investigations on glycosyltransferases have not yet
undergone much scrutiny, although some kinetic studies have been reported.

Glycosyltransferases are a diverse group of enzymes that catalyze the transfer of a single
monosaccharide unit from a donor to the hydroxyl group of an acceptor saccharide. The acceptor can be either
a free saccharide, glycoprotein, glycolipid, or polysaccharide. The donor can be a nucleotide-sugar, or
dolichol-phosphate-sugar. Glycosyltransferases show a precise specificity for both the sugar acceptor and
donor, and generally require the presence of a metal cofactor.
SUMMARY OF THE INVENTION 

Broadly stated, the present invention relates to the secondary and three-dimensional structures of
glycosyltransferases, and parts thereof. The glycosyltransferase structure may be the structure the enzyme
takes up when it is associated with one or more moieties (e.g. an acceptor, a sugar nucleotide donor, or
components thereof). The invention also contemplates a glycosyltransferase structure comprising a secondary
or three-dimensional structure of a glycosyltransferase in association with a moiety. The defined boundaries
and properties of the structures and any of the moieties bound to them are pertinent to methods for determining
the secondary or three-dimensional structures of polypeptides with unknown structure, and to methods that
identify modulators of glycosyltransferases. These modulators are potentially useful as therapeutics for
diseases, including (but not limited to) tumor growth, metastasis of tumors, bacterial, viral, and parasitic
infections, and inflammatory diseases such as rheumatoid arthritis, asthma, inflammatory bowel disease, and
atherosclerosis.

In an embodiment, the invention provides a crystalline form of a polypeptide corresponding to a
glycosyltransferase, or a part thereof. The invention preferably contemplates a crystalline form a
glycosyltransferase takes up when it is complexed with a moiety, including a nucleotide sugar donor,
acceptor, metal cofactor, or heavy metal atom. The crystalline form may also comprise one or more heavy
metal atoms, or at least one compound. A unit cell of the crystalline form of the invention may have
dimensions of about a = 40.4 ± 3.0 Å, b = 82.4 ± 3.0 Å, c = 102.5 ± 3.0 Å.
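
The stated unit cell tolerances can be read as a simple range check. The following is a minimal sketch only: the function name and the example measurements are hypothetical, and nothing beyond the three cell edges and the ± 3.0 Å tolerance is taken from the text.

```python
# Reference cell edges and tolerance quoted in the text (in angstroms).
REFERENCE_CELL = {"a": 40.4, "b": 82.4, "c": 102.5}
TOLERANCE = 3.0


def matches_reference_cell(a, b, c, reference=REFERENCE_CELL, tolerance=TOLERANCE):
    """Return True when every measured edge lies within +/- tolerance of the reference edge."""
    observed = {"a": a, "b": b, "c": c}
    return all(abs(observed[axis] - reference[axis]) <= tolerance for axis in reference)


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    print(matches_reference_cell(41.0, 81.0, 104.0))  # all three edges inside the tolerance
    print(matches_reference_cell(45.0, 82.4, 102.5))  # edge a is 4.6 angstroms off
```

A crystal whose measured edges pass this check would fall within the "about" ranges given above; the check says nothing about cell angles or space group, which the passage does not state.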

A glycosyltransferase structure of the invention may also be characterized by one or more of the 
following: 



SUBSTITUTE SHEET (RULE 26)
(a) an N-terminal domain (amino acid residues 106-317 in Table 3) comprising an eight-stranded
mixed β-sheet (β1-β8 in Figure 25) flanked by six helices (α1-α6 in Figure 25) and a small
two-stranded antiparallel β-sheet (β4' and β8' in Figure 25); and

(b) a C-terminal domain (amino acid residues 354-447 in Table 3) comprising a four-stranded mixed
β-sheet (β9, β10, β13, and β14 in Figure 25) flanked by three α-helices (α7-α9 in Figure 25)
and a short β-finger (β11 and β12 in Figure 25).

The N-terminal domain and C-terminal domain may be connected by a linker region (residues 331 to 
353 in Table 3) which wraps halfway around the N-terminal domain before starting the first helix of the C- 
terminal domain. 

The crystalline form may also be specifically characterized by the parameters, diffraction statistics 
and/or refinement statistics set out in Table 6. 

The invention also contemplates a secondary or three-dimensional structure (e.g. a crystalline form) of
a domain of a glycosyltransferase. In accordance with one aspect, the invention contemplates a secondary or
three-dimensional structure of a domain comprising an eight-stranded mixed β-sheet, flanked by six helices
and a small two-stranded antiparallel β-sheet. The domain is also referred to herein as the "spsA GnT I core
domain" or "SGC domain". In accordance with a preferred embodiment, the invention contemplates a domain
comprising an eight-stranded mixed β-sheet represented as β1-β8 in Figure 25, flanked by six helices
represented by α1-α6 in Figure 25, and a small two-stranded antiparallel β-sheet represented by β4' and β8'
in Figure 25. A secondary or three-dimensional structure of a polypeptide comprising an SGC domain of the
invention is also within the scope of the invention.

The invention further contemplates a loop structure of a glycosyltransferase. A loop structure may be
characterized as the structure adjacent to the nucleotide-sugar donor binding site comprising amino acid
residues 318-330 in Table 3. The loop structure may be further characterized by amino acid residues 320-323
forming a type IV turn and amino acid residues 324-330 making one complete turn of an α-helix. A secondary
or three dimensional structure of a polypeptide comprising a loop structure of the invention is also within the
scope of the invention.
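
The residue ranges given above for the N-terminal domain, loop structure, linker region, and C-terminal domain can be collected into a small lookup. This is a minimal sketch for orientation only: the names `REGIONS` and `region_of` are hypothetical, and only the residue boundaries come from the text (numbering as in Table 3).

```python
# Residue ranges (inclusive) as stated in the text; numbering follows Table 3.
REGIONS = [
    ("N-terminal domain", 106, 317),
    ("loop structure", 318, 330),
    ("linker region", 331, 353),
    ("C-terminal domain", 354, 447),
]


def region_of(residue_number):
    """Return the name of the structural region containing the residue, or None."""
    for name, first, last in REGIONS:
        if first <= residue_number <= last:
            return name
    return None


if __name__ == "__main__":
    for residue in (150, 320, 340, 400, 50):
        print(residue, region_of(residue))
```

Residues outside 106-447 return `None`, since the passage assigns no region to them.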

The invention also relates to a method of forming a crystalline form of the invention. 

The invention also features a method of determining secondary or three-dimensional structures of
polypeptides with unknown structure comprising the step of applying the structural atomic coordinates of a
crystalline form of a glycosyltransferase of the invention.

The invention also provides a secondary or three-dimensional structure of a binding site of a
glycosyltransferase. Binding sites include the binding sites for one or more of a diphosphate or pyrophosphate
group of a sugar nucleotide donor, a nucleotide of a sugar nucleotide donor, a nitrogenous heterocyclic base
(preferably a pyrimidine base, more preferably uracil) of a sugar nucleotide donor, a sugar of the nucleotide of
a sugar nucleotide donor, a selected sugar of a sugar nucleotide donor that is transferred to an acceptor, and/or
an acceptor. The secondary or three-dimensional structure of a binding site may be defined by selected atomic
contacts in the site. Thus, broadly stated, the present invention provides a secondary or three-dimensional
structure of a binding site of a glycosyltransferase defined by one or more atomic interactions or enzyme



WO 00/78936 PCT/CA00/00725

atomic contacts as set forth in Table 5. Each of the atomic interactions is defined in Table 5 by an atomic
contact (more preferably, a specific atom where indicated) on the sugar nucleotide donor or acceptor, and an
atomic contact (more preferably a specific atom where indicated) on the glycosyltransferase.
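
As an illustration of pairing an atom on the donor or acceptor with an atom on the enzyme, the sketch below lists atom pairs that lie within a distance cutoff. It is an assumption-laden example, not the procedure used to compile Table 5: the 4.0 Å cutoff, the atom labels, and the coordinates are all hypothetical.

```python
import math


def find_contacts(ligand_atoms, protein_atoms, cutoff=4.0):
    """List (ligand atom, protein atom, distance) for pairs within the cutoff.

    Both arguments map an atom label to an (x, y, z) tuple in angstroms.
    The cutoff is an assumed value chosen for illustration.
    """
    contacts = []
    for ligand_name, ligand_xyz in ligand_atoms.items():
        for protein_name, protein_xyz in protein_atoms.items():
            distance = math.dist(ligand_xyz, protein_xyz)
            if distance <= cutoff:
                contacts.append((ligand_name, protein_name, round(distance, 2)))
    # Shortest contacts first.
    return sorted(contacts, key=lambda contact: contact[2])


if __name__ == "__main__":
    # Hypothetical labels and coordinates, not taken from the structure.
    ligand = {"ligand_atom_1": (0.0, 0.0, 0.0), "ligand_atom_2": (10.0, 0.0, 0.0)}
    protein = {
        "protein_atom_1": (3.0, 0.0, 0.0),
        "protein_atom_2": (10.0, 0.0, 2.5),
        "protein_atom_3": (20.0, 20.0, 20.0),
    }
    for contact in find_contacts(ligand, protein):
        print(contact)
```

Each returned tuple has the same shape as an entry described above: one atom on the ligand side and one atom on the enzyme side.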

The invention also relates to modulators derived from a secondary or three-dimensional structure of a
glycosyltransferase, binding sites, atomic interactions, or atomic contacts thereof, or a domain of a secondary
or three-dimensional structure of a glycosyltransferase, including a SGC domain. Preferably, the modulators
are derived from binding sites for a sugar nucleotide donor or parts thereof, an acceptor or parts thereof,
including the SGC domain, and the binding site described herein as the loop structure. The invention provides
inhibitors that are derived from a DxD motif, for example, peptides having the sequences as shown in Figures
27 and 31 (SEQ ID NOs 1-9).

The present invention also contemplates a method of identifying a modulator of a glycosyltransferase, 
a binding site or a domain thereof, comprising the step of using the structural coordinates of a 
glycosyltransferase, binding sites, atomic interactions, or atomic contacts thereof, or domain thereof, to 
computationally evaluate a test compound for its ability to associate with the glycosyltransferase, binding site, 
or domain thereof. Use of the structural coordinates of a glycosyltransferase structure, binding sites, atomic 
interactions, or atomic contacts of the invention to identify a modulator is also provided. 

In an embodiment of the invention, a method is provided for identifying a modulator of a
glycosyltransferase by determining binding interactions between a test compound and a binding site of a
glycosyltransferase, or atomic interactions, or atomic contacts thereof, or a domain of a glycosyltransferase
defined in accordance with the invention comprising:

(a) generating the binding site, atomic interactions, atomic contacts, or domain on a 
computer screen; 

(b) generating a test compound with its spatial structure on the computer screen; and 

(c) testing to determine whether the test compound binds to the binding site, a selected 
number of atomic contacts, or the domain. 

Methods are also provided for identifying a potential modulator of a glycosyltransferase function by 
docking a computer representation of a compound with a computer representation of a structure of a 
glycosyltransferase or a part thereof, that is defined by the atomic structural coordinates, atomic interactions, or 
atomic contacts described herein. 

In an embodiment the method comprises the following steps: 

(a) docking a computer representation of a compound from a computer data base with a 
computer representation of a selected site (e.g. the sugar nucleotide donor or acceptor 
binding site, loop structure, or SGC domain) on a glycosyltransferase defined in accordance 
with the invention, to obtain a complex; 

(b) determining a conformation of the complex with a favourable geometric fit and favourable 
complementary interactions; and 

(c) identifying compounds that best fit the selected site as potential modulators of the 
glycosyltransferase. 
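
Steps (a) to (c) above can be sketched as a toy ranking loop. This is a simplified illustration under stated assumptions, not a docking program: candidate poses are assumed to be supplied already, the scoring rule (+1 per atom pair at contact distance, -10 per steric clash) and its distance thresholds are invented for the example, and all compound names and coordinates are hypothetical.

```python
import math


def score_pose(pose, site_atoms, contact_range=(3.0, 4.5), clash_below=2.0):
    """Toy score: +1 per ligand/site atom pair at contact distance, -10 per clash."""
    score = 0.0
    for ligand_xyz in pose:
        for site_xyz in site_atoms:
            distance = math.dist(ligand_xyz, site_xyz)
            if distance < clash_below:
                score -= 10.0  # atoms overlap: poor geometric fit
            elif contact_range[0] <= distance <= contact_range[1]:
                score += 1.0   # atoms close enough to interact favourably
    return score


def rank_compounds(database, site_atoms):
    """Rank compounds by their best-scoring pose.

    database maps a compound name to a list of candidate poses; each pose is a
    list of (x, y, z) atom positions placed in the frame of the selected site.
    """
    best = {}
    for name, poses in database.items():
        best[name] = max(score_pose(pose, site_atoms) for pose in poses)
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    site = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]  # hypothetical site atoms
    database = {
        "compound_A": [[(2.0, 3.0, 0.0)], [(0.5, 0.0, 0.0)]],
        "compound_B": [[(10.0, 10.0, 10.0)]],
    }
    for name, score in rank_compounds(database, site):
        print(name, score)
```

Taking the maximum over poses stands in for step (b), and the sorted list stands in for step (c); real docking software generates the poses itself and uses far richer scoring functions.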
In another embodiment the method comprises the following steps:

(a) modifying a computer representation of a compound complexed with a selected site (e.g.
sugar nucleotide donor or acceptor binding site, loop structure, or SGC domain) on a
glycosyltransferase defined in accordance with the invention, by deleting or adding a
chemical group or groups;

(b) determining a conformation of the complex with a favourable geometric fit and favourable
complementary interactions; and

(c) identifying a compound that best fits the selected site as a potential modulator of a 
glycosyltransferase. 

In still another embodiment the method comprises the following steps: 

(a) selecting a computer representation of a compound complexed with a selected site (e.g. sugar 
nucleotide donor or acceptor binding site, loop structure, or SGC domain) on a 
glycosyltransferase defined in accordance with the invention; and 

(b) searching for molecules in a data base that are similar to the compound using a searching 
computer program, or replacing portions of the compound with similar chemical structures 
from a data base using a compound building computer program. 
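
The database search in step (b) can be illustrated with a set-based similarity measure. The sketch below uses the Tanimoto coefficient over sets of structural features, which is one common choice and an assumption here, since the text does not name a similarity measure or program. The feature labels, candidate names, and the 0.5 threshold are hypothetical.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by all distinct features."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(query, database, threshold=0.5):
    """Return (name, similarity) for entries at or above the threshold, best first."""
    hits = [(name, tanimoto(query, features)) for name, features in database.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical feature sets standing in for real molecular fingerprints.
    query = {"uracil", "ribose", "diphosphate", "GlcNAc"}
    database = {
        "candidate_1": {"uracil", "ribose", "diphosphate"},
        "candidate_2": {"uracil", "ribose", "phosphonate", "GlcNAc"},
        "candidate_3": {"adenine", "ribose"},
    }
    for name, similarity in most_similar(query, database):
        print(name, similarity)
```

In practice the feature sets would be fingerprints computed by cheminformatics software; the ranking logic stays the same.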

A compound that interacts with a glycosyltransferase, binding sites or atomic contacts thereof, or a 
domain thereof, identified using a method of the invention may be used as a modulator of any 
glycosyltransferase or composition bearing the interacting binding site, atomic contacts, or domain. Therefore, 
the invention features a modulator of a glycosyltransferase identified by a method of the invention. 

The invention further contemplates classes of modulators of glycosyltransferases based on the
three-dimensional structure of a sugar nucleotide donor, or component thereof, or acceptor, defined in relation to the
sugar nucleotide donor's or acceptor's spatial association with a glycosyltransferase structure. Generally, a
method is provided for designing potential inhibitors of a glycosyltransferase comprising the step of using the
structural coordinates of a sugar nucleotide donor or acceptor or component thereof, defined in relation to its
spatial association with the glycosyltransferase structure or a binding site thereof, to generate a compound that
is capable of associating with the glycosyltransferase or binding site thereof.

It will be appreciated that a modulator of a glycosyltransferase may be identified by generating an
actual secondary or three-dimensional model of a binding site, synthesizing a compound, and examining the
components to find whether the required interaction occurs.

A potential modulator of a glycosyltransferase identified by a method of the present invention may be 
confirmed as a modulator by synthesizing the compound, and testing its effect on the glycosyltransferase in an 
assay for that glycosyltransferase's enzymatic activity. Such assays are known in the art. 

A modulator of the invention may be converted using customary methods into pharmaceutical 
compositions. A modulator may be formulated into a pharmaceutical composition containing a modulator 
either alone or together with other active substances. 

Therefore, the methods of the invention for identifying modulators may comprise one or more of the 
following additional steps: 
(a) testing whether the modulator is a modulator of the activity of a glycosyltransferase, preferably
testing the activity of the modulator in cellular assays and animal model assays;

(b) modifying the modulator; 

(c) optionally rerunning steps (a) or (b); and 

(d) preparing a pharmaceutical composition comprising the modulator. 

Steps (a), (b), (c) and (d) may be carried out in any order, at different points in time, and they need not be
sequential.

The invention also contemplates a method of treating a disease associated with a glycosyltransferase 
with inappropriate activity in a cellular organism, comprising: 

(a) administering a modulator of the invention in an acceptable pharmaceutical preparation; and 

(b) activating or inhibiting a glycosyltransferase to treat the disease. 

The invention provides for the use of a modulator identified by the methods of the invention in the 
preparation of a medicament to treat a disease associated with a glycosyltransferase with inappropriate activity 
in a cellular organism. Use of the structural coordinates of a glycosyltransferase structure of the invention to 
manufacture a medicament is also provided. 

Another aspect of the invention provides machine readable media encoded with data representing the
coordinates of the secondary or three dimensional structure of a glycosyltransferase, binding sites or atomic
contacts thereof, or domain as defined herein, or the three dimensional structure of a sugar nucleotide donor or
acceptor defined in relation to its spatial association with a glycosyltransferase structure as defined herein. The
invention also provides computerized representations of the secondary or three-dimensional structures of the
invention, including any electronic, magnetic, or electromagnetic storage forms of the data needed to define the
structures such that the data will be computer readable for purposes of display and/or manipulation.

These and other aspects of the present invention will become evident upon reference to the following 
detailed description and attached drawings. 
DESCRIPTION OF THE DRAWINGS 

The invention will now be described in relation to the drawings in which: 

Figure 1 is a secondary structure diagram of GnT-I, as viewed along the beta sheet, from strand "bS."
Note the eight-stranded beta sheet twist in the foreground, and the four-stranded beta sheet, offset in the
background.

Figure 2 is a secondary structure diagram of GnT-I, showing a view from the side. The first domain
is a mixed eight-stranded beta sheet, backed by alpha helices, indicated by "b" for the beta strands and "a" for
the alpha helices. The second domain is a mixed four-stranded beta sheet, again backed by helices, and
indicated with capital "B" and "A," respectively.

Figure 3 is a sample of experimental MAD MeHg-derivative GnT-I density, from the bottom of the 
active site pocket. The Hg position was identified using SOLVE. SHARP was used to refine the Hg 
parameters, and CCP4 dm was used for solvent-flattening and histogram matching, giving the shown map. 

Figure 4 is a hydrophobic surface diagram of the top and bottom of GnT-I, with hydrophobic regions
in green. Note the patch in the pocket, as well as at the base of the alpha-helix "tower."

Figure 5 is an electrostatic surface diagram of the top and bottom of GnT-I, with acidic regions in
red, and basic regions in blue. Note the large acidic patch to one side of the active site pocket.

Figure 6 is a conservation diagram of the active site pocket of GnT-I. Conserved regions are
indicated in red, with "A" being fully conserved, and "0" unconserved. Alignments of active GnT-I's (rabbit,
human, rat, mouse, golden hamster, Chinese hamster, C. elegans #1 and #3, and frog) were performed using
CLUSTALX, and conservation was calculated using AMAS. Note the highly conserved active site pocket.

Figure 7 is a worm diagram of the GnT-I structure with secondary structure shown. Beta strands are
shown as arrows, and alpha helices are helices. UDP-GlcNAc and the Mn²⁺ ion are shown in the binding site.

Figures 8A through 8F are surface diagrams of the GnT-I structure. UDP-GlcNAc and the Mn²⁺ ion are
shown in the binding site. (8A) The phosphate-binding loop lid, which forms upon UDP-GlcNAc binding, is
shown as a worm. (8B) The loop is shown as a surface. (8C) The surface has been colored according to
potential. Basic potential is shown in blue, and acidic potential is shown in red; the loop is shown as a worm.
(8D) As in 8C, but with the loop shown as a surface. (8E) The surface has been colored according to residue
AMAS conservation index. Red regions are conserved, white are unconserved; the loop is shown as a worm.
(8F) As in 8E, but with the loop shown as a surface.

Figure 9 shows diagrams of the active site of the GnT-I enzyme. Asp291 is shown as a stick figure
on the left side of the pocket, while the rest of the protein is shown as a surface. UDP-GlcNAc is shown as a
stick figure on the right. Mn²⁺ has been shown as a sphere. (9A) The loop is shown as a worm. (9B) The loop
is shown as a surface. Note the mannose-sized active site pocket.

Figure 10 is a surface diagram of GnT-I bound to the model of the Man5GlcNAc2 acceptor.
UDP-GlcNAc is shown as a stick, and the Mn²⁺ has been shown as a sphere. (10A) The acceptor model is shown as a
stick figure. (10B) The acceptor has been shown as a space-filling van der Waals figure.

Figure 11 is the same as Figure 10 but from a different angle, making the fit of the acceptor to the
surface more visible. (11A) The acceptor has been shown as a stick figure, and the loop as a worm. (11B) The
acceptor has been shown as a stick figure, and the loop as a surface. (11C) The acceptor has been shown as
space-filling van der Waals spheres, and the loop as a surface. (11D) As in 11B, but with the surface colored
according to residue conservation index. Note the correlation of the acceptor model to red conserved residues.

Figure 12 shows a model of the active site of GnT-I, with the base D291 (i.e. Asp291), the α-1,3
mannose O2, and the GlcNAc C1 joined by lines of small spheres. The protein backbone has been shown as an
alpha-carbon trace, the acceptor Man5GlcNAc2 sugar, UDP-GlcNAc, and protein side-chains have been shown
as stick figures, and the Mn²⁺ ion and bound water molecules have been shown as spheres.

Figure 13 shows a model of the overlay of GnT-I (red), Bacillus subtilis nucleotide-diphospho-sugar
transferase (spsA) (green), Escherichia coli N-acetylglucosamine 1-phosphate uridyltransferase (GlmU) (blue),
and bovine β-1,4-galactosyltransferase T1 (galT) (cyan). Parts of the protein sequence not in the transferase
fold are shaded a darker color.

Figure 14 shows the overlay of GnT-I and GlmU from the model of Figure 13. The DALI z-score (a
measure of structural similarity) for this overlay is 9.6. Dissimilar structures give scores less than 2; greater
similarity gives a higher score.

Figure 15 shows the overlay of GnT-I and β-1,4-galT from the model of Figure 13. The DALI
z-score for this overlay is 10.6.

Figure 16 shows the overlay of GnT-I and spsA from the model of Figure 13. The DALI z-score for
this overlay is 15.7.

Figure 17 shows the model of Figure 13 from a different angle. Note the overlay of the
helix-loop-helix containing the catalytic base Asp residue (Asp 291 in GnT-I, Asp 191 in spsA, and Asp in galT).
Figure 18 shows the overlay of GnT-I and GlmU from the model of Figure 17.
Figure 19 shows the overlay of GnT-I and galT from the model of Figure 17.
Figure 20 shows the overlay of GnT-I and spsA from the model of Figure 17.

Figure 21 shows the secondary structure of GnT-I. Helices are in red and β sheets are in green.
Areas not in the conserved fold are darkened.

Figure 22 shows the secondary structure of GlmU. Helices are in red and β sheets are in green. β
strand 6 has been deleted.

Figure 23 shows the secondary structure of galT. Helices are in red and β sheets are in green. β
strand 3 has been deleted along with helix 2 leading into it. Instead, a small β finger N-terminal of the core
domain and a β finger C-terminal of the core domain occupy the space of β strand 3.

Figure 24 shows the secondary structure of spsA. Helices are in red and β sheets are in green. All
eight strands are present in the core domain.

Figure 25 is a GnT I Ribbon Diagram. Domain 1 is shown in cyan, the loop (residues 318 to 330)
structured upon UDP-GlcNAc binding in red, the linker connecting Domain 1 and Domain 2 in green, Domain
2 in brown, and the UDP-GlcNAc and the Mn²⁺ ion are shown in yellow. All molecular images were prepared
using SPOCK (Christopher, 1998) and rendered using Raster3D (Bacon, 1988; Merritt, 1994).

Figure 26 shows the electrostatic potential surface of GnT I, showing the acidic pocket into which the
Mn²⁺ ion and UDP-GlcNAc bind. Acidic residues are colored red, and basic residues blue, with a gradient
through ± 10 kT. The UDP-GlcNAc is shown in yellow.

Figure 27 shows a sample of the AMAS analysis. Shown is an excerpt from the AMAS analysis, with
residues in the region of the "DxD" motif (residues 211 to 213, EDD). GnT I sequences from rabbit, human,
mouse, rat, Chinese hamster, golden hamster, frog, and C. elegans genes gly-12 and gly-14 were aligned using
ClustalX, and conservation was scored using AMAS. Unconserved residues are given a score of "0", and fully
conserved residues are given a score of "A". (SEQ ID NO 1, 2, 3, 4, and 5).

Figure 28 shows AMAS surface analysis. AMAS residue scores, as shown in Figure 27, were then
mapped onto the protein surface, with a gradient from green for a completely unconserved score of 0, to white
for an AMAS score of 5, to red for a fully conserved score of "A".

Figure 29 shows a stereo ribbon overlay of the SGC domains of GnT I (red) and spsA (green). For
clarity only the α-helices are labeled. UDP (spsA) and UDP-GlcNAc (GnT I) are shown in stick representation.
M and C label the side chains of the metal binding and catalytic aspartic acid residues, also shown in stick
representation.

Figure 30 shows topology diagrams of GnT I, spsA, GlmU (an N-acetylglucosamine-1-phosphate
uridyltransferase from Escherichia coli) and β4Gal-T1 (a bovine galactosyltransferase,
β-1,4-galactosyltransferase 1). Beta strands are shown as green triangles, and alpha helices as red circles, with
missing elements shown in white. The secondary structural elements are labeled as in GnT I. The boxed gray
region corresponds to the SGC domain.

Figure 31 shows a structural alignment of GnT I, spsA, GlmU and β4Gal-T1. (SEQ ID NO 6, 7, 8,
and 9). Shown are two excerpts from the complete alignment, numbered according to rabbit GnT I. In the top
alignment, the region around the DxD motif is shown, with the motif highlighted in magenta. In the bottom
alignment, the area around the catalytic Asp is shown, with the catalytic residue again highlighted in magenta.
Note that GlmU is not a glycosyltransferase, but rather an N-acetylglucosamine-1-phosphate uridyltransferase,
so it does not share the catalytic residue found in GnT I, spsA, and β4Gal-T1.

Figure 32 shows the GnT I substrate binding site. All interactions between the protein, the
UDP-GlcNAc, the Mn2+ ion, and structured waters are shown as lines composed of small white spheres.

Figures 33A, B, and C show a stereo view of the UDP-GlcNAc/Mn2+ binding site. Carbon, oxygen,
nitrogen, sulfur, and phosphorus are colored white, red, blue, yellow and purple respectively; water molecules
are cyan and the Mn2+ ion is salmon. Hydrogen bonds are shown as dotted lines. The C1 of the
acetylglucosamine moiety is labeled for reference. 33A) Uracil and ribose interactions; 33B) Mn2+ and
phosphate interactions; 33C) N-acetylglucosamine interactions.

Figure 34 shows interactions between GnT I, the Mn2+ ion, and the UDP-GlcNAc phosphates. R117
is from the N-terminus of helix α1, E211 and D213 are from the C-terminus of strand β4, T315 and G317 are
from strand pS' and the N-terminus of the loop lid, and V321 and S322 are from the tip of the loop lid.

Figure 35 shows interactions between GnT I and the GlcNAc group of UDP-GlcNAc. Residue Y184
is in helix α3, residue E211 is in strand β4, residue L269 is from the C-terminus of strand β7, residues F289,
W290, D291 and R295 are from helix α6, and L331 is from the C-terminal end of the loop lid. D291 is the
only Asp that is close enough to the GlcNAc C1 to act as the catalytic base.

Figure 36 shows GnT I overlaid on spsA: GnT I appears in red, and spsA in green. In this Figure, the
position of the ligands is shown. GnT I is bound to UDP-GlcNAc, shown as a red stick figure, along with a
Mn2+ ion, shown as a red sphere near Asp213; spsA is bound to UDP, shown as a green stick figure, along with
a Mn2+ ion, shown as a green sphere near Asp 99. Note how the nucleotides and proteins overlay very closely.
The catalytic base residue in GnT I, Asp 291, identified by this structure, has an analogous residue in spsA,
Asp 191. This predicts that Asp 191 is the catalytic base in spsA. The catalytic base was not identifiable with
the spsA x-ray crystal structure alone, due to the absence in the spsA structure of the sugar residue normally
attached to the UDP.

Figure 37 shows GnT I overlaid on β-1,4-galT: GnT I appears in red, and β-1,4-galT in cyan. Again,
the ligands of GnT I are shown, as in Figure 36. The ligand in the β-1,4-galT x-ray crystal structure, UDP, is
shown as a stick figure; the Mn2+ normally required in the reaction is absent, as is the sugar part of the donor
sugar-nucleotide. Again, GnT I's Asp 291 has an analogous galT residue, Asp318. This residue is predicted by
the GnT I structure to be the catalytic base in β-1,4-galT.

Figure 38 shows GnT I overlaid on GlmU: GnT I appears in red, and GlmU in navy blue. The ligands
of GnT I are shown in red, as in Figures 36 and 37. The GlmU product, UDP-GlcNAc, is shown as a navy
blue stick figure. As GlmU is not a glycosyltransferase, these two enzymes do not catalyse the same reaction, and thus
the residues involved in enzymatic action are expected to be different. However, the similar fold and similar
location of sugar-nucleotide binding suggest that these enzymes may have evolved from an extremely distant
common ancestor.

Figure 39 shows the DxD motif. Atom colors and labels are as in Figure 33. Letters i to i+3
correspond to the residues of the type I β-turn. The hydrogen bond characteristic of this turn type is shown in
green.

Figures 40A and 40B show a stereo diagram of the structured loop and the acceptor binding pocket.
Atom colors and labels are as in Figure 33. Backbone tubes and molecular surfaces are color coded as follows:
red, structured loop; green, linker region; cyan, Domain 1; brown, Domain 2. 40A) Structured loop and
UDP-GlcNAc/Mn2+ interactions; 40B) Surface representation of the acceptor binding pocket. The side chain of the
catalytic base (D291) and the N-acetylglucosamine moiety of the UDP-GlcNAc are seen at the base of the
pocket.

DETAILED DESCRIPTION OF THE INVENTION 
Summary of Tables 1 to 8 

Table 1 - structural coordinates of an N-acetylglucosaminyltransferase I (GnT-I) native structure.

Table 2 - structural coordinates of a GnT-I with bound MeHg.

Table 3 - structural coordinates of a rabbit GnT-I bound to UDP-GlcNAc and a manganese 2+ ion.

Table 4 - structural coordinates of a GnT-I with acceptor.

Table 5 - intermolecular contacts of the GnT-I-UDP-GlcNAc complex.

Table 6 - crystallographic data and refinement statistics.

Table 7 - the UDP-GlcNAc binding site.

Table 8 - protein threading results.

In Tables 1 to 4, from the left, the second column identifies the atom number; the third identifies the
atom type; the fourth identifies the amino acid type; the fifth identifies the residue number; the sixth identifies
the x coordinates; the seventh identifies the y coordinates; the eighth identifies the z coordinates; the ninth
identifies the occupancy; and the eleventh identifies the temperature factor.
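
As a hedged illustration of the column layout just described, the following Python sketch reads one whitespace-separated coordinate row into named fields. The row shown is a made-up placeholder and not a value from Tables 1 to 4; the name AtomRecord and the choice to read the temperature factor from the last field of the row are assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass
class AtomRecord:
    """One coordinate row; field names are illustrative, not from the tables."""
    atom_number: int
    atom_type: str
    residue_type: str
    residue_number: int
    x: float
    y: float
    z: float
    occupancy: float
    temperature_factor: float


def parse_row(row: str) -> AtomRecord:
    """Split a whitespace-separated row and map columns 2 to 9 by position."""
    fields = row.split()
    if len(fields) < 10:
        raise ValueError("expected at least 10 whitespace-separated fields")
    return AtomRecord(
        atom_number=int(fields[1]),      # second column
        atom_type=fields[2],             # third column
        residue_type=fields[3],          # fourth column
        residue_number=int(fields[4]),   # fifth column
        x=float(fields[5]),              # sixth column
        y=float(fields[6]),              # seventh column
        z=float(fields[7]),              # eighth column
        occupancy=float(fields[8]),      # ninth column
        # Assumption: the temperature factor is the last field of the row.
        temperature_factor=float(fields[-1]),
    )


# Hypothetical row for demonstration only.
example_row = "ATOM 1 N ALA 106 10.000 20.000 30.000 1.00 25.00"
record = parse_row(example_row)
```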
Definitions: 

Unless otherwise indicated, all terms used herein have the same meaning as they would to one skilled 
in the art of the present invention. Practitioners are particularly directed to Current Protocols in Molecular
Biology (Ausubel) for definitions and terms of the art.

"Glycosyltransferase structure" or "glycosyltransferase secondary or three-dimensional structure" 
refers to the three-dimensional structure (i.e. tertiary structure) or arrangement of secondary structural elements 
of a purified polypeptide comprising a glycosyltransferase. A glycosyltransferase structure may be in 
association with or complexed with a moiety including a heavy metal atom or metal cofactor. A 
glycosyltransferase structure may be in crystalline form.

The term "crystalline form", in the context of the invention, is a crystal formed from an aqueous
solution comprising a purified polypeptide comprising a glycosyltransferase. The glycosyltransferase is
preferably a glycosyltransferase with an SGC domain, including but not limited to a glycosyltransferase
structurally related to N-acetylglucosaminyltransferase I to VIII, preferably N-acetylglucosaminyltransferase I.
A crystalline form of a glycosyltransferase is characterized as being capable of diffracting x-rays in a pattern
defined by one of the crystal forms depicted in Blundell et al, 1976, Protein Crystallography, Academic Press. A
crystalline form may include a crystal structure in association with one or more moieties, including
heavy-metal atoms (i.e. a derivative crystal), or one or more compounds (i.e. a co-crystal).

The term "associate", "association" or "associating" refers to a condition of proximity between a
moiety (i.e. chemical entity or compound or portions or fragments thereof), and a glycosyltransferase, or parts
or fragments thereof (e.g. binding sites or domains). The association may be non-covalent, i.e. where the
juxtaposition is energetically favored by, for example, hydrogen-bonding, van der Waals, or electrostatic or
hydrophobic interactions, or it may be covalent.

The term "heavy-metal atoms" refers to atoms that can be used to solve an x-ray crystallography
phase problem, including but not limited to a transition element, a lanthanide metal, or an actinide metal.
Lanthanide metals include elements with atomic numbers between 57 and 71, inclusive. Actinide metals
include elements with atomic numbers between 89 and 103, inclusive.
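
The two inclusive atomic-number ranges given in this definition can be expressed as a small check. This is only a sketch of the ranges as stated; the function name is hypothetical, and the "other" label simply means that an element falls in neither range (transition elements are not classified here).

```python
def classify_f_block(atomic_number: int) -> str:
    """Classify an element by the inclusive ranges stated in the definition."""
    if 57 <= atomic_number <= 71:
        return "lanthanide"
    if 89 <= atomic_number <= 103:
        return "actinide"
    # Neither range applies; this says nothing about transition elements.
    return "other"
```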

A "metal cofactor" refers to a metal ion required for a glycosyltransferase to transfer the selected
sugar from the sugar nucleotide donor to the acceptor. For example, the metal cofactor for
N-acetylglycosyltransferase may be a divalent cation such as manganese or magnesium, or other similar atoms or
metals.

The term "glycosyltransferase" refers to an enzyme that catalyzes the transfer of a single
monosaccharide unit from a donor to the hydroxyl group of an acceptor substrate. The acceptor can be a
free saccharide, glycoprotein, glycolipid, or polysaccharide. The donor can be a nucleotide-sugar or
dolichol-phosphate-sugar. Glycosyltransferases show a precise specificity for both the sugar acceptor and donor and
generally require the presence of a metal cofactor. The term "glycosyltransferase" also encompasses
polypeptides comprising a SGC domain.

Glycosyltransferases include but are not limited to eukaryotic glycosyltransferases involved in the
biosynthesis of glycoproteins, glycolipids, glycosylphosphatidylinositols and other complex glycoconjugates,
and prokaryotic glycosyltransferases involved in the synthesis of carbohydrate structures of bacteria and
viruses, including enzymes involved in LOS and lipopolysaccharide biosynthesis. Examples of
glycosyltransferases include N-acetylglucosaminyltransferases, including N-acetylglucosaminyltransferases I
through VIII involved in the biosynthesis of complex and hybrid N-glycans;
UDP-N-acetylglucosamine:N-acetylgalactosamine β1,6-N-acetylglucosaminyltransferases (core 2 GlcNAc transferases); Core 3 GlcNAc
transferase, Core 4 GlcNAc transferase; Core 1 and Core 2 elongation glycosyltransferases involved in the
biosynthesis of O-glycans and the glycosyltransferases involved in the biosynthesis of antigen determinants
(blood group i and blood group I); and structurally related proteins.

The enzyme at the gateway from high-mannose structures to hybrid and complex N-glycans is
UDP-N-acetylglucosamine:α-3-D-mannoside β-1,2-N-acetylglucosaminyltransferase I [GnT I; E.C. 2.4.1.101;
Harpaz and Schachter, 1980; Narasimhan et al, 1977; Stanley et al, 1975; see GenBank M61829 and M55621
(human) and M57301 (rabbit) for nucleic acid and amino acid sequences]. It transfers the first
N-acetylglucosamine residue onto the high-mannose core, and all other enzymes in the hybrid and complex
pathway depend on its prior action (Schachter, 1986; Schachter, 1991). GnT I plays a fundamental role in
mammalian development, as shown by knockout studies in mice (Ioffe, 1994; Metzler, 1994). Moreover,
mutation or misregulation of several of the enzymes dependent on GnT I action is associated with human
disease and metastasis (Jaeken et al, 1994; Charuk et al, 1995; Jaeken et al, 1993; Granovsky et al, 2000; Tan
et al, 1996).

Glycosyltransferases have been classified into 44 different families, based on both sequence similarity
and substrate/product stereochemistry (inverting or retaining) (Campbell et al, 1997; Campbell et al, 1998;
Coutinho and Henrissat, 1999). GnT I (family 13) is an inverting glycosyltransferase: the α-linked GlcNAc
moiety from the UDP-α-GlcNAc donor is transferred to the 3-arm of the Man5GlcNAc2 acceptor, creating the
β-linked GlcNAc-β-1,2-Man-R product (Reck et al, 1994).

As applied to polypeptides, the term "substantial sequence identity" means that two peptide
sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap weights, share at least
40%, 50%, 60%, 65%, 70%, 75%, 80%, or 85% sequence identity, preferably at least 90 percent sequence
identity, more preferably at least 95 percent sequence identity or more. Preferably, residue positions which are
not identical differ by conservative amino acid substitutions. For example, the substitution of amino acids
having similar chemical properties such as charge or polarity is not likely to affect the properties of a protein.
Examples include glutamine for asparagine or glutamic acid for aspartic acid.
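
To make the percentage thresholds concrete, the sketch below computes percent identity for two sequences that have already been aligned. It is a minimal illustration and not the GAP or BESTFIT algorithm: the alignment step is assumed to have been done elsewhere, the denominator (columns in which neither sequence has a gap) is one convention among several, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when the pair shares at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two toy six-residue sequences that differ at one position share five of six residues, which would satisfy an 80% threshold but not a 90% threshold.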

The term "mutant" refers to a polypeptide that is obtained by replacing at least one amino acid residue
in a native glycosyltransferase with a different amino acid residue. Mutation can also be accomplished by
adding and/or deleting amino acid residues within the native glycosyltransferase or part thereof. A mutant may
or may not be functional.

The term "function" refers to the ability of a modulator to enhance or inhibit the association between
a glycosyltransferase and a compound, or the activity of the glycosyltransferase.

"Modulator" refers to a molecule which changes or alters the biological activity of a
glycosyltransferase. A modulator may increase or decrease glycosyltransferase activity, or change its
characteristics, or functional or immunological properties. It may be an inhibitor that decreases the biological
or immunological activity of the protein. Modulators include but are not limited to peptides, members of
random peptide libraries and combinatorial chemistry-derived molecular libraries, phosphopeptides (including
members of random or partially degenerate, directed phosphopeptide libraries), antibodies, carbohydrates,
nucleosides or nucleotides or parts thereof, and small organic or inorganic molecules. A modulator may be an
endogenous physiological compound, or it may be a natural or synthetic compound.

The term "atomic structural coordinates" or "structural coordinates" as used herein refers to a data set
that defines the three dimensional structure of a molecule or molecules (e.g. Cartesian coordinates, temperature
factors, and occupancies). Structural coordinates can be slightly modified and still render nearly identical three
dimensional structures. A measure of a unique set of structural coordinates is the root-mean-square deviation
of the resulting structure. Structural coordinates that render three dimensional structures (in particular a three
dimensional structure of an SGC domain) that deviate from one another by a root-mean-square deviation of
less than 5 Å, 4 Å, 3 Å, 2 Å, or 1.5 Å may be viewed by a person of ordinary skill in the art as very similar.
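
The root-mean-square deviation referred to here can be sketched as follows. This minimal example assumes that the two coordinate sets contain the same atoms in the same order and have already been superimposed; the superposition step itself is not shown, and the function names and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def cutoffs_satisfied(value, cutoffs=(5.0, 4.0, 3.0, 2.0, 1.5)):
    """Return the cutoffs (in angstroms) that the RMSD value falls below."""
    return [cutoff for cutoff in cutoffs if value < cutoff]


# Hypothetical coordinates: every atom is displaced by 1 angstrom along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(set_a, set_b)
```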

The term "unit cell" refers to the smallest and simplest volume element (i.e. parallelepiped-shaped
block) of a crystal that is completely representative of the unit of pattern of the crystal. The unit cell axial
lengths are represented by a, b, and c. Those of skill in the art understand that a set of atomic coordinates
determined by X-ray crystallography is not without standard error.

The term "space group" refers to the lattice and symmetry of the crystal. In a space group designation 
the capital letter indicates the lattice type and the other symbols represent symmetry operations that can be 
carried out on the contents of the asymmetric unit without changing its appearance. 

The term "purified", in reference to a polypeptide, does not require absolute purity such as a
homogenous preparation; rather, it represents an indication that the polypeptide is relatively purer than in the
natural environment. Generally, a purified polypeptide is substantially free of other proteins, lipids,
carbohydrates, or other materials with which it is naturally associated, preferably at a functionally significant
level, for example at least 85% pure, more preferably at least 95% pure, most preferably at least 99% pure. A
skilled artisan can purify a polypeptide comprising a glycosyltransferase using standard techniques for protein
purification. A substantially pure polypeptide comprising a glycosyltransferase will yield a single major band
on a non-reducing polyacrylamide gel. The purity of the glycosyltransferase can also be determined by
amino-terminal amino acid sequence analysis.

A "sugar nucleotide donor" refers to a nucleotide coupled to a selected sugar that is transferred by a
glycosyltransferase to an acceptor. The selected sugar may be a monosaccharide. A suitable selected sugar
includes N-acetylglucosamine (GlcNAc). The N-acetylglucosamine may be modified; for example, the
hydroxyls may be blocked with acetonide, acylated, or alkylated, or substituted with other groups such as
halogen. For N-acetylglucosaminyltransferases the nucleotide is preferably UDP. For other enzymes, the
nucleotide may be GDP (fucosyltransferases and mannosyltransferases) or CMP (sialyltransferases). The
heterocyclic amine base in the nucleotide may be modified. For example, when the base is uridine it may be
modified at the C-5 or C-6 position with groups including but not limited to alkyl, aryl, and electron donating
and electron withdrawing groups. The sugar in the nucleotide (e.g. ribose) may be modified at the 2' or 3'
position with groups including but not limited to alkyl, aryl, and electron donating and electron withdrawing
groups.

"Acceptor" refers to the part of a carbohydrate structure (e.g. glycoprotein, glycolipid) where the
selected sugar is transferred by the glycosyltransferase. The acceptor may comprise Man5GlcNAc2.

Abbreviations for amino acid residues are the standard 3-letter and/or 1-letter codes used in the art to
refer to one of the 20 common L-amino acids.
Glycosyltransferase Structures 

The present invention provides a secondary or three-dimensional structure of a glycosyltransferase or
part thereof (e.g. binding site or domain). In an embodiment the structure is a crystalline form. A
glycosyltransferase structure may comprise a glycosyltransferase in a unit cell. In an embodiment, a
glycosyltransferase is arranged in a crystalline manner in a space group P212121 so as to form a unit cell of
dimensions a = 40.4 ± 3.0 Å, b = 82.4 ± 3.0 Å, c = 102.5 ± 3.0 Å, α = β = γ = 90°, and which effectively diffracts
X-rays for determination of the atomic coordinates of a glycosyltransferase. The secondary and
three-dimensional structure of a preferred glycosyltransferase of the invention is illustrated by the
N-acetylglucosaminyltransferase I (GnT I) structure specifically described herein. A glycosyltransferase structure may
be defined by the structural coordinates of Tables 1, 2, 3, or 4.
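
Because all three cell angles are 90°, the unit cell volume is simply the product of the axial lengths. The sketch below computes that volume for the nominal dimensions and checks whether a measured cell falls within the stated ± 3.0 Å tolerance on each axis. The function names and the "measured" values used in the test are hypothetical.

```python
NOMINAL_CELL = (40.4, 82.4, 102.5)  # a, b, c in angstroms, as stated above
TOLERANCE = 3.0                     # allowed deviation per axis, in angstroms


def orthorhombic_volume(a: float, b: float, c: float) -> float:
    """Volume in cubic angstroms; valid only when all cell angles are 90 degrees."""
    return a * b * c


def within_tolerance(observed, nominal=NOMINAL_CELL, tolerance=TOLERANCE) -> bool:
    """True when every observed axial length is within the tolerance of nominal."""
    return all(abs(obs - nom) <= tolerance for obs, nom in zip(observed, nominal))


nominal_volume = orthorhombic_volume(*NOMINAL_CELL)
```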

A glycosyltransferase structure includes the secondary or three-dimensional structure of a native
glycosyltransferase, a derivative glycosyltransferase, or a mutant glycosyltransferase. Thus, a crystalline form
includes native crystals, derivative crystals, and co-crystals. The crystals generally comprise a substantially
pure glycosyltransferase in crystalline form. It is understood that the glycosyltransferase structures of the
invention are not limited to naturally occurring or native glycosyltransferases but include polypeptides
comprising an SGC domain, or polypeptides with substantial sequence identity to a glycosyltransferase. A
glycosyltransferase structure also includes mutants of a native glycosyltransferase obtained by replacing at
least one amino acid residue in a native glycosyltransferase with a different amino acid residue, or by adding or
deleting amino acid residues within the native polypeptide, and having substantially the same secondary or
three-dimensional structure as the native glycosyltransferase from which the mutant is derived, i.e. having a set
of atomic structural coordinates that have a root mean square deviation of less than or equal to about 5, 4, 3, 2,
or 1.5 Å when superimposed with the atomic structure coordinates of the native glycosyltransferase from which
the mutant is derived when at least 50% to 100% of the atoms of the native glycosyltransferase domain are
included in the superimposition. It should be noted that the glycosyltransferase structures contemplated herein
need not exhibit glycosyltransferase activity.

A derivative glycosyltransferase structure of the invention comprises a glycosyltransferase structure in
association with one or more moieties that are heavy metal atoms. For example, derivative crystals of the
invention generally comprise a crystalline glycosyltransferase in covalent association with one or more heavy
metal atoms. The glycosyltransferase may correspond to a native or mutated glycosyltransferase. Heavy metal
atoms useful for providing derivative glycosyltransferase structures include, by way of example and not
limitation, gold, mercury, etc.

The invention features a glycosyltransferase structure in association with one or more moieties that
are compounds (e.g. UDP-GlcNAc, uridine-ribose, phosphate-Mn2+, Man5GlcNAc2, one or more metal
cofactors). The association may be covalent or non-covalent. Crystalline forms of this type are referred to
herein as co-crystals. The compound may be any organic molecule, and it may modulate the function of a
glycosyltransferase by, for example, inhibiting or enhancing its function, or it may be an acceptor, donor, or
metal cofactor for the glycosyltransferase. It is preferred that the geometry of the compound and the
interactions formed between the compound and the glycosyltransferase provide high affinity binding between
the two molecules.

The secondary or three-dimensional structures of the particular glycosyltransferases described herein
provide useful models for the secondary or three-dimensional structures of glycosyltransferases from any
species, particularly mammalian, including bovine, ovine, porcine, murine, equine, preferably human, from any
source whether natural, synthetic, semi-synthetic, or recombinant.
Binding Sites - The GnT-I active site

N-acetylglucosaminyltransferase I catalyses the addition of a β-1,2 GlcNAc onto the α-1,3 arm of the
Man5GlcNAc2 N-linked carbohydrate moiety. The structure has allowed the identification of the binding site of
UDP-GlcNAc, identification of the reaction centre, and the development of a working model of the
Man5GlcNAc2-acceptor binding site that correlates with biochemical reaction inhibition evidence.

The UDP-GlcNAc binding site can be subdivided into three sub sites: the uridine-ribose binding sub
site, the phosphate-Mn2+ binding sub site, and the GlcNAc sub site.

In the uridine-ribose sub site, there are three direct hydrogen bonds between the protein and the
nucleotide sugar. Asp144 interacts with the uridine N3, the His190 ND1 interacts with the uridine O2, and
Asp212 binds the ribose O3. In addition, there is one water-mediated bond between Asp212 and the ribose
O2. Meanwhile, the uridine base makes van der Waals interactions with Ile187, as well as the cysteine bridge
between Cys115 and Cys145.

The phosphate-Mn2+ site is the subject of many interactions between the nucleotide sugar and protein;
in fact, while the manganese co-ordination site lies on the enzyme's surface, a majority of the interactions with
the phosphates come from a loop which structures itself on top of the substrate upon binding.

The protein itself has only one direct co-ordination bond to the Mn2+, via Asp213; two of the six
co-ordination points are taken up by the phosphate oxygens (one from each phosphate), and the final three points
are bound by water. These waters are then bound by the Thr315 OG, the Gly317 carbonyl oxygen, Glu211
and Asp213.

The phosphate groups make one direct hydrogen bond to Arg117 NH on the protein's rigid surface,
while making three hydrogen bonds with the flexible loop which rigidifies into a lid on top of the
phosphate-Mn2+ subsite. These loop interactions are with the Val321 backbone N and the backbone N and OG of Ser322.
In addition, a two-water hydrogen-bonding bridge leads to Asp116.

In contrast to the previous two sub sites, which hold the UDP-GlcNAc rigidly in place, the
GlcNAc-binding sub site must allow the sugar ring enough flexibility to go through the flat penta-coordinate C1
intermediate. Three direct hydrogen bonds are made: two between the GlcNAc O4 and Glu211 and Trp290,
anchoring the O4-C1 axis of the GlcNAc in place, and one between the GlcNAc O3 and Glu211, establishing
the correct pucker for the sugar ring. One water bridge also exists between the sugar and the protein; the
GlcNAc O6 hydrogen bonds to a water molecule held in place by the amide nitrogens of Phe289 and Trp290,
along with the carbonyl oxygen of Tyr184.

The acetyl group methyl makes van der Waals contact with Leu269 and Leu331, leaving the acetyl
group O7 and N2, along with the GlcNAc ring O5, unbonded. This lack of interaction may give the C2 and
O5 enough flexibility to make the movements necessary for the C1 to achieve the reaction intermediate SN2
conformation.

This experimental nucleotide-sugar binding conformation has allowed the identification of the base in
the reaction: Asp291 is located just over 5 Å from the GlcNAc C1, putting it in a perfect position to perform
this role.

The identification of the reaction centre and binding site has provided the constraints necessary to
make a theoretical model of the Man5GlcNAc2-acceptor binding site. The α-1,3 mannose O2 is placed
between the Asp291 OD1 and the GlcNAc C1, putting it into position for the nucleophilic attack on the C1.
This positioning forces the conformation of the rest of the mannose: the O3 forms hydrogen bonds to Asp291,
Arg295, and a structured water held in place by Arg415; the O4 hydrogen bonds to the same structured water
as the O3; and the O6 hydrogen bonds to both a UDP-GlcNAc phosphate and the OG of Ser322, a
phosphate-binding lid loop residue. This α-1,3 mannose orientation corresponds with biochemical evidence
that all of the mannose ring's hydroxyl groups are important. In addition, this model supports the
ordered-sequential reaction sequence, as the GlcNAc is buried below the Man5GlcNAc2, and is consistent with further evidence
that the Man5GlcNAc2 binding site is partially formed upon GnT-I's UDP-GlcNAc binding.

The Man5GlcNAc2 core mannose position is also constrained by the reaction centre location: the O4
hydrogen bonds to Asp291, the ring is in van der Waals contact with Phe289 and Tyr184, and the O6 hydrogen
bonds to Asp292. Again, this position supports the known biochemistry, as the O4 is important, the O2 is
unimportant, and the β O1 linkage is required to allow the ring to sit against the protein surface. An α O1
would clash with the protein, and may break up the important lectin-like van der Waals interaction with the
phenylalanine.

Finally, the model allows the positioning of the α-1,6 mannose, the α-1,3 and α-1,6 mannoses
attached to it, and the chitobiosyl-core GlcNAc2. The positions of these sugar rings in the model correspond
with the location of conserved GnT-I surface residues; biochemical evidence states that these sugars are less
important to Man5GlcNAc2 binding, and thus their position is less well defined than the α-1,3 arm and core
mannose.

In summary, the N-acetylglucosaminyltransferase I structure has allowed the exact identification of
the UDP-GlcNAc binding site, along with the reaction centre, and allowed the prediction of the
Man5GlcNAc2-acceptor binding site. This UDP-GlcNAc-bound, closed-loop GnT-I structure is critical for the design of
high-affinity inhibitors to the activity of GnT-I.

Therefore, the invention contemplates a secondary or three-dimensional structure of a binding site of
a glycosyltransferase. Binding sites include the binding site for a diphosphate group of a sugar nucleotide
donor, a nucleotide of a sugar nucleotide donor, a nitrogenous heterocyclic base (preferably a pyrimidine
base, more preferably uracil) of a sugar nucleotide donor, a sugar of the nucleotide of a sugar nucleotide donor,
a selected sugar of a sugar nucleotide donor that is transferred to an acceptor, and/or an acceptor. A three
dimensional structure of a binding site may be defined by selected atomic contacts, preferably the enzyme
atomic contacts as defined in Table 5.

In an embodiment of the invention, a secondary or three-dimensional structure of a binding site of a
glycosyltransferase that associates with a diphosphate of a sugar nucleotide donor (or the secondary or
three-dimensional structure of a complex of the binding site with the diphosphate) is provided comprising at least
two or three atomic contacts of atomic interactions 8, 9, and 10 in Table 5, each atomic interaction defined
therein by an atomic contact (more preferably, a specific atom where indicated) on the diphosphate group, and
an atomic contact (more preferably, a specific amino acid residue where indicated) on the glycosyltransferase
(i.e. enzyme atomic contact). The binding site may be defined by the enzyme atomic contacts of atomic
interactions 8 and 9; 8 and 10; 9 and 10; or 8, 9, and 10 in Table 5. Preferably, the binding site is defined by
the atoms of the enzyme atomic contacts having the structural coordinates for the atoms listed in Table 1, 2, 3,
or 4.

In an embodiment of the invention, a secondary or three-dimensional structure of a binding site of a
glycosyltransferase that associates with a heterocyclic amine base (preferably uracil) of a sugar nucleotide
donor (or the secondary or three-dimensional structure of a complex of the binding site with a heterocyclic
amine base) is provided comprising at least two, three, or four atomic contacts of atomic interactions 1, 2, 3, 4,
and 5 in Table 5, each atomic interaction defined therein by an atomic contact (more preferably, a specific
atom where indicated) on the heterocyclic amine base, and an atomic contact (more preferably, a specific
amino acid residue where indicated) on the glycosyltransferase (i.e. enzyme atomic contact). The binding site
may be defined by the enzyme atomic contacts of atomic interactions 1, 2, and 3; 2, 3, and 4; 3, 4, and 5; 1, 2,
and 4; 1, 2, and 5; 1, 3, and 4; 1, 3, and 5; 2, 3, and 5; 2, 4, and 5; 1, 2, 3, and 4; 1, 2, 3, and 5; 2, 3, 4, and 5; 1,
3, 4, and 5; or 1, 2, 3, 4, and 5 in Table 5. Preferably, the binding site is defined by the atoms of the enzyme
atomic contacts having the structural coordinates for the atoms listed in Table 1, 2, 3, or 4.

In an embodiment of the invention, a secondary or three-dimensional structure of a binding cavity of a 
glycosyltransferase that associates with the sugar of the nucleotide (preferably ribose) of a sugar nucleotide 
donor (or a secondary or three-dimensional structure of a complex of the binding site with the sugar) is 
provided comprising the atomic contacts of atomic interactions 6 and 7 in Table 5, each atomic interaction 
defined therein by an atomic contact (more preferably, a specific atom where indicated) on the sugar, and an 
atomic contact (more preferably, a specific amino acid residue where indicated) on the glycosyltransferase (i.e. 
enzyme atomic contact). The binding site may be defined by the enzyme atomic contacts of atomic interactions 
6 and 7 in Table 5. Preferably, the binding site is defined by the atoms of the enzyme atomic contacts in the 
binding site having the structural coordinates for the atoms listed in Table 1, 2, 3, or 4. 

In an embodiment of the invention, a secondary or three-dimensional structure of a binding cavity of a 
glycosyltransferase that associates with a selected sugar (GlcNAc) of a sugar nucleotide donor (or a secondary 
or three-dimensional structure of a complex of the binding site with the selected sugar) is provided comprising 
at least two, three, four, five, six, seven, or eight atomic contacts selected from the atomic contacts of atomic 
interactions 14, 15, 16, 17, 18, 19, 20, and 21 in Table 5, each atomic interaction defined therein by an atomic 
contact (more preferably, a specific atom where indicated) on the selected sugar, and an atomic contact (more 
preferably, a specific amino acid residue where indicated) on the glycosyltransferase (i.e. enzyme atomic 
contact). The binding site may be defined by the enzyme atomic contacts of atomic interactions 14, 18, and 
19; 14, 20, and 21; 14, 15, 16, and 17; 18, 19, 20, and 21; and 14 through 21 in Table 5. Preferably, the 
binding site is defined by the atoms of the enzyme atomic contacts in the binding site having the structural 
coordinates for the atoms listed in Table 1, 2, 3, or 4. 

In an embodiment of the invention, a secondary or three-dimensional structure of a binding cavity of 
a glycosyltransferase that associates with a nucleotide (preferably UDP) of a sugar nucleotide donor (or a 
secondary or three-dimensional structure of a complex of the binding site and nucleotide) is provided 
comprising at least two, three, four, five, six, seven, eight, nine, or ten atomic contacts of atomic interactions 
1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 in Table 5, each atomic interaction defined therein by an atomic contact (more 
preferably, a specific atom where indicated) on the nucleotide, and an atomic contact (more preferably, a 
specific amino acid residue where indicated) on the glycosyltransferase (i.e. enzyme atomic contact). The 
binding site may be defined by enzyme atomic contacts of atomic interactions 1, 2, 6, 7, 8, 9, and 10; 3, 4, 6, 7, 
8, 9, and 10; and 1 through 10 in Table 5. Preferably, the binding site is defined by the atoms of the enzyme 
atomic contacts in the binding site having the structural coordinates for the atoms listed in Table 1, 2, 3, or 4. 

In an embodiment of the invention, a secondary or three-dimensional structure of a binding cavity of 
a glycosyltransferase that associates with a sugar nucleotide donor (e.g. UDP-GlcNAc) (or a secondary or 
three-dimensional structure of a complex of the binding site with the sugar nucleotide donor) is provided 
comprising at least two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, 
sixteen, seventeen, eighteen, nineteen, twenty, or twenty-one atomic contacts of atomic interactions 1, 2, 3, 4, 
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, and 21 in Table 5, each atomic interaction defined 
therein by an atomic contact (more preferably, a specific atom where indicated) on the sugar nucleotide donor, 
and an atomic contact (more preferably, a specific amino acid residue where indicated) on the 
glycosyltransferase (i.e. enzyme atomic contact). The binding site may be defined by enzyme atomic contacts 
of atomic interactions 1, 2, 6, 7, 8, 9, 10, 14, 15, 18, and 20; 1, 2, 6, 7, 8, 9, 10, 14, 16, 17, 19, and 21; 3, 4, 6, 
7, 8, 9, 10, 14, 15, 18, and 20; 3, 4, 6, 7, 8, 9, 10, 14, 16, 17, 19, and 21; or 1 through 21 listed in Table 5. 
Preferably the binding site is defined by the atoms of the enzyme atomic contacts in the binding site having the 
structural coordinates for the atoms listed in Table 1, 2, 3, or 4. 
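
The groupings of Table 5 interaction numbers recited in the preceding embodiments can be checked mechanically. The following is a minimal sketch, assuming only the interaction numbers and the "at least two" threshold stated above; the site labels, the function name `sites_present`, and the use of a single threshold for every site are illustrative assumptions and are not part of the specification.

```python
# Interaction numbers refer to rows of Table 5; the groupings follow the
# embodiments above. The dictionary keys are illustrative labels only.
SITE_INTERACTIONS = {
    "base": set(range(1, 6)),         # heterocyclic amine base: interactions 1-5
    "ribose": {6, 7},                 # sugar of the nucleotide: interactions 6 and 7
    "diphosphate": {8, 9, 10},        # diphosphate group: interactions 8-10
    "GlcNAc": set(range(14, 22)),     # selected sugar: interactions 14-21
    "nucleotide": set(range(1, 11)),  # nucleotide (e.g. UDP): interactions 1-10
    "donor": set(range(1, 22)),       # sugar nucleotide donor: interactions 1-21
}


def sites_present(observed, min_contacts=2):
    """Return the labels of sites having at least `min_contacts` observed interactions."""
    observed = set(observed)
    return sorted(
        name
        for name, wanted in SITE_INTERACTIONS.items()
        if len(observed & wanted) >= min_contacts
    )


if __name__ == "__main__":
    # Interactions 8 and 9 satisfy the diphosphate grouping and, being a subset,
    # also the nucleotide and donor groupings.
    print(sites_present({8, 9}))
```

For the ribose grouping the text recites both interactions 6 and 7, which coincides with a threshold of two.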

A glycosyltransferase structure may be characterized by a "loop" structure. The loop folds on top of 
the pyrophosphate after the sugar nucleotide donor associates with the active site of the glycosyltransferase. 
Molecules that associate with the loop are highly specific inhibitors of the enzymes. In an embodiment of the 
invention, a secondary or three-dimensional structure of a loop structure of a glycosyltransferase that binds a 
pyrophosphate of a sugar nucleotide donor is provided comprising at least two, three, four, five, six, or seven 
atomic contacts of atomic interactions 11, 12, 13, 23, 24, 25, and 27 in Table 5. The binding site may be 
defined by enzyme atomic contacts 11, 12, and 13; 11, 12, 13, and 27; 23, 24, 25, and 27; or 11, 12, 13, 23, 24, 
25, and 27 in Table 5. Preferably, the binding site is defined by the atoms of the enzyme atomic contacts in the 
binding site having the structural coordinates for the atoms listed in Table 1, 2, 3, or 4. 

A secondary or three-dimensional structure of a binding site of a glycosyltransferase that associates 
with a Man5GlcNAc2-acceptor (or a secondary or three-dimensional structure of a complex of the binding site 
with the acceptor) is also provided comprising at least two, three, four, five, or six atomic contacts of atomic 
interactions 22, 23, 24, 25, 26, and 27 in Table 5, each atomic interaction defined therein by an atomic contact 
(more preferably, a specific atom where indicated) on the acceptor, and an atomic contact (more preferably, a 
specific amino acid residue where indicated) on the glycosyltransferase (i.e. enzyme atomic contact). The 
binding site may be defined by enzyme atomic contacts of atomic interactions 22, 23, and 24; 23, 24, and 25; 
24, 25, and 26; 25, 26, and 27; 22, 23, 24, and 25; 23, 24, 25, and 26; 24, 25, 26, and 27; 22, 23, 24, 25, and 
26; 23, 24, 25, 26, and 27; and 22 through 27 in Table 5. Preferably, the binding site is defined by the atoms of 
the enzyme atomic contacts in the binding site having the structural coordinates for the atoms listed in Table 1, 
2, 3, or 4. 

Method for Preparing Crystal Forms of a Glycosyltransferase 

The invention also features a method for creating crystalline glycosyltransferase structures described 
herein. The method may utilize a polypeptide comprising a glycosyltransferase described herein to form a 
crystal. A polypeptide used in the method may be chemically synthesized in whole or in part using techniques 
that are well-known in the art. Alternatively, methods are well known to the skilled artisan to construct 
expression vectors containing the native or mutated glycosyltransferase coding sequence and appropriate 
transcriptional/translational control signals. These methods include in vitro recombinant DNA techniques, 
synthetic techniques, and in vivo recombination/genetic recombination. See for example the techniques 
described in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor 
Laboratory Press (1989)), and other laboratory textbooks. (See also Sarker et al., Glycoconjugate J. 7:380, 
1990; Sarker et al., Proc. Natl. Acad. Sci. USA 88:234-238, 1991; Sarker et al., Glycoconjugate J. 11:204-209, 
1994; Hull et al., Biochem Biophys Res Commun 176:608, 1991; and Pownall et al., Genomics 12:699-704, 
1992). 

Crystals are grown from an aqueous solution containing the purified glycosyltransferase polypeptide 
by a variety of conventional processes. These processes include batch, liquid bridge, dialysis, vapor diffusion, 
and hanging drop methods. (See for example, McPherson, 1982, John Wiley, New York; McPherson, 1990, 
Eur. J. Biochem. 189:1-23; Weber, 1991, Adv. Protein Chem. 41:1-36). Generally, the native crystals of the 
invention are grown by adding precipitants to the concentrated solution of the glycosyltransferase polypeptide. 
The precipitants are added at a concentration just below that necessary to precipitate the protein. Water is 
removed by controlled evaporation to produce precipitating conditions, which are maintained until crystal 
growth ceases. 

In an embodiment of the invention, the method comprises mixing a volume of a glycosyltransferase 
solution (e.g. 5 mg glycosyltransferase/ml to 15 mg glycosyltransferase/ml, preferably 10 mg/ml) with a 
reservoir solution; and equilibrating against the reservoir solution under vapour-diffusion conditions. 

It will be appreciated that the crystallization conditions can be varied and such variations can be used 
alone or in combination. 

Derivative crystals of the invention can be obtained by soaking native crystals in a solution containing 
salts of heavy metal atoms. A complex of the invention can be obtained by soaking a native crystal in a 
solution containing a compound that binds the glycosyltransferase, or by co-crystallizing 
the glycosyltransferase polypeptide in the presence of one or more compounds that bind to the 
glycosyltransferase. 

Once the crystal is grown it can be placed in a glass capillary tube and mounted onto a holding device 
connected to an X-ray generator and an X-ray detection device. Collection of X-ray diffraction patterns is 
well documented by those skilled in the art (See for example, Ducruix and Geige, 1992, IRL Press, Oxford, 
England). A beam of X-rays enters the crystal and diffracts from it. An X-ray detection device can be 
utilized to record the diffraction patterns emanating from the crystal. Suitable devices include the Marr 345 
imaging plate detector system with an RU200 rotating anode generator. 

Methods for obtaining the three dimensional structure of the crystalline form of a molecule or 
complex are described herein and known to those skilled in the art (see Ducruix and Geige). Generally, the X-ray 
crystal structure is given by the diffraction patterns. Each diffraction pattern reflection is characterized as a 
vector and the data collected at this stage determine the amplitude of each vector. The phases of the vectors 
may be determined by the isomorphous replacement method where heavy atoms soaked into the crystal are 
used as reference points in the X-ray analysis (see for example, Otwinowski, 1991, Daresbury, United 
Kingdom, 80-86). The phases of the vectors may also be determined by molecular replacement (see for 
example, Naraza, 1994, Proteins 11:281-296). The amplitudes and phases of vectors from the crystalline form 
of a glycosyltransferase, e.g. an N-acetylglucosaminyltransferase I, determined in accordance with these 
methods can be used to analyze other crystalline glycosyltransferases, particularly those with an SGC domain. 

The unit cell dimensions and symmetry, and vector amplitude and phase information can be used in a 
Fourier transform function to calculate the electron density in the unit cell, i.e. to generate an experimental 
electron density map. This may be accomplished using the PHASES package (Furey, 1990). Amino acid 
sequence structures are fit to the experimental electron density map (i.e. model building) using computer 
programs (e.g. Jones, T.A. et al., Acta Crystallogr. A47, 100-119, 1991). This structure can also be used to 
calculate a theoretical electron density map. The theoretical and experimental electron density maps can be 
compared and the agreement between the maps can be described by a parameter referred to as R-factor. A high 
degree of overlap in the maps is represented by a low value R-factor. The R-factor can be minimized by using 
computer programs that refine the structure to achieve agreement between the theoretical and observed 
electron density maps. For example, the XPLOR program, developed by Brunger (1992, Nature 355:472-475) 
can be used for model refinement. 
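
As an illustration of the agreement parameter discussed above, the conventional crystallographic residual is computed on structure factor amplitudes rather than directly on maps: R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|). The sketch below assumes this conventional definition, which the text does not spell out; the function name and the example amplitudes are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).

    `f_obs` and `f_calc` are equal-length sequences of structure factor
    amplitudes for the same reflections (an assumption of this sketch).
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical amplitudes for three reflections.
    observed = [10.0, 20.0, 30.0]
    calculated = [9.0, 22.0, 30.0]
    print(f"R = {r_factor(observed, calculated):.3f}")
```

A perfect match gives R = 0, consistent with the statement that a high degree of overlap corresponds to a low R-factor.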

A three dimensional structure of the molecule or complex may be described by atoms that fit the 
theoretical electron density characterized by a minimum R value. Files can be created for the structure that 
define each atom by coordinates in three dimensions. 

Identification of Homologues 

The knowledge of a glycosyltransferase structure of the invention enables one skilled in the art to 
identify homologues of glycosyltransferases. This is achieved by searches of three-dimensional databases. 
Since structural folds are conserved to a greater extent than sequence, one may identify homologues with very 
little sequence identity or similarity. Programs that provide this type of database searching are known in the art 
and include Dali. The structural coordinates of a protein structure are submitted and the program performs a 
multiple structural alignment with proteins in the Protein Data Bank. Homologues identified in accordance with 
the present invention may be used in the methods of the invention described herein. 

Methods for Determining Secondary or Three Dimensional Structures 

The structure coordinates of a glycosyltransferase structure described herein can be used as a model 
for determining the secondary or three-dimensional structures of additional native or mutated 
glycosyltransferases with unknown structure, as well as the structures of co-crystals of glycosyltransferases 
with compounds such as acceptors, donors (e.g. UDP-GlcNAc or analogues thereof), and modulators (e.g. 
stimulators or inhibitors). The structure coordinates and models of a glycosyltransferase structure can also be 
used to determine solution-based structures of native or mutant glycosyltransferases. 

Secondary or three-dimensional structure may be determined by applying the structural coordinates of 
a glycosyltransferase structure to other data such as an amino acid sequence, X-ray crystallographic diffraction 
data, or nuclear magnetic resonance (NMR) data. Homology modeling, molecular replacement, and nuclear 
magnetic resonance methods using these other data sets are described below. 

Homology modeling (also known as comparative modeling or knowledge-based modeling) methods 
develop a three dimensional model from a polypeptide sequence based on the structures of known proteins 
(e.g. native or mutated glycosyltransferases). In the present invention the method utilizes a computer 
representation of a glycosyltransferase structure, preferably a three dimensional structure of an 
N-acetylglucosaminyltransferase I, or a complex of same, a computer representation of the amino acid sequence 
of a polypeptide with an unknown structure (additional native or mutated glycosyltransferases, or polypeptides 
comprising an SGC domain), and standard computer representations of the structures of amino acids. The 
method in particular comprises the steps of: (a) identifying structurally conserved and variable regions in the 
known structure; (b) aligning the amino acid sequences of the known structure and unknown structure; (c) 
generating coordinates of main chain atoms and side chain atoms in structurally conserved and variable regions 
of the unknown structure based on the coordinates of the known structure, thereby obtaining a homology 
model; and (d) refining the homology model to obtain a three dimensional structure for the unknown structure. 
This method is well known to those skilled in the art (Greer, 1985, Science 228, 1055; Blundell et al., 1988, Eur. 
J. Biochem. 172, 513; Knighton et al., 1992, Science 258:130-135; 
http://biochem.vt.edu/courses/modeling/homology.htn). Computer programs that can be used in homology 
modeling are Quanta and the Homology module in the Insight II modeling package distributed by Molecular 
Simulations Inc, or MODELLER (Rockefeller University, www.iucr.ac.uk/sinris-top/Iogical/prg- 
modellenhtml). 

In step (a) of the homology modeling method, the known glycosyltransferase structure (e.g. structure 
of the N-acetylglucosaminyltransferase I) is examined to identify the structurally conserved regions (SCRs) 
from which an average structure, or framework, can be constructed for these regions of the protein. Variable 
regions (VRs), in which known structures may differ in conformation, also must be identified. SCRs generally 
correspond to the elements of secondary structure, such as alpha-helices and beta-sheets, and to ligand- and 
substrate-binding sites (e.g. acceptor and donor binding sites). The VRs usually lie on the surface of the 
proteins and form the loops where the main chain turns. 

Many methods are available for sequence alignment of known structures and unknown structures. 
Sequence alignments generally are based on the dynamic programming algorithm of Needleman and Wunsch 
[J. Mol. Biol. 48: 442-453, 1970]. Current methods include FASTA, Smith-Waterman, and BLASTP, with the 
BLASTP method differing from the other two in not allowing gaps. Scoring of alignments typically involves 
construction of a 20x20 matrix in which identical amino acids and those of similar character (i.e., conservative 
substitutions) may be scored higher than those of different character. Substitution schemes which may be used 
to score alignments include the scoring matrices PAM (Dayhoff et al., Meth. Enzymol. 91: 524-545, 1983), 
and BLOSUM (Henikoff and Henikoff, Proc. Nat. Acad. Sci. USA 89: 10915-10919, 1992), and the matrices 
based on alignments derived from three-dimensional structures including that of Johnson and Overington (JO 
matrices) (J. Mol. Biol. 233: 716-738, 1993). 
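
The dynamic programming approach referred to above can be sketched as follows. This is a minimal illustration of Needleman-Wunsch global alignment, assuming a simple match/mismatch score and a linear gap penalty in place of a 20x20 substitution matrix such as PAM or BLOSUM; the parameter values and the function name are illustrative assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences `a` and `b`; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def pair_score(x, y):
        # Stand-in for a lookup in a 20x20 substitution matrix.
        return match if x == y else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,  # gap in b
                score[i][j - 1] + gap,  # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Replacing `pair_score` with a lookup into a substitution matrix would give the matrix-based scoring described in the text.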

Alignment based solely on sequence may be used; however, other structural features also may be 
taken into account. In Quanta, multiple sequence alignment algorithms are available that may be used when 
aligning a sequence of the unknown with the known structures. Four scoring systems (i.e. sequence homology, 
secondary structure homology, residue accessibility homology, CA-CA distance homology) are available, each 
of which may be evaluated during an alignment so that relative statistical weights may be assigned. 

When generating coordinates for the unknown structure, main chain atoms and side chain atoms, both 
in SCRs and VRs, need to be modeled. A variety of approaches known to those skilled in the art may be used to 
assign coordinates to the unknown. In particular, the coordinates of the main chain atoms of SCRs will be 
transferred to the unknown structure. VRs correspond most often to the loops on the surface of the polypeptide 
and if a loop in the known structure is a good model for the unknown, then the main chain coordinates of the 
known structure may be copied. Side chain coordinates of SCRs and VRs are copied if the residue type in the 
unknown is identical to or very similar to that in the known structure. For other side chain coordinates, a side 
chain rotamer library may be used to define the side chain coordinates. When a good model for a loop cannot 
be found, fragment databases may be searched for loops in other proteins that may provide a suitable model for 
the unknown. If desired, the loop may then be subjected to conformational searching to identify low energy 
conformers. 
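
The side chain rules described above can be summarized as a per-residue decision over an alignment. The sketch below is illustrative only: the set of "very similar" residue pairs, the action labels, and the treatment of a gap in the known structure as a case for loop searching are assumptions of this example and are not defined in the text.

```python
# Hypothetical residue pairs treated as "very similar"; the text does not
# define which substitutions qualify.
SIMILAR_PAIRS = {frozenset("KR"), frozenset("DE"), frozenset("IL")}


def plan_side_chains(known_aligned, unknown_aligned):
    """Return one action per residue of the unknown sequence.

    Both arguments are aligned one-letter sequences of equal length,
    with '-' marking a gap.
    """
    if len(known_aligned) != len(unknown_aligned):
        raise ValueError("aligned sequences must have equal length")
    plan = []
    for known, unknown in zip(known_aligned, unknown_aligned):
        if unknown == "-":
            continue  # no residue in the unknown at this position
        if known == "-":
            plan.append("loop_search")  # no template residue (assumed handling)
        elif known == unknown or frozenset((known, unknown)) in SIMILAR_PAIRS:
            plan.append("copy")  # identical or similar: copy coordinates
        else:
            plan.append("rotamer_library")  # otherwise use a rotamer library
    return plan


if __name__ == "__main__":
    print(plan_side_chains("AKW-L", "ARAG-"))
```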

Once a homology model has been generated it is analyzed to determine its correctness. A computer 
program available to assist in this analysis is the Protein Health module in Quanta which provides a variety of 
tests. Other programs that provide structure analysis along with output include PROCHECK and 3D-Profiler 
[Luthy R. et al. Nature 356: 83-85, 1992; and Bowie, J.U. et al. Science 253: 164-170, 1991]. Once any 
irregularities have been resolved, the entire structure may be further refined. Refinement may consist of energy 
minimization with restraints, especially for the SCRs. Restraints may be gradually removed for subsequent 
minimizations. Molecular dynamics may also be applied in conjunction with energy minimization. 

Molecular replacement involves applying a known structure to solve the X-ray crystallographic data 
set of a polypeptide of unknown structure (e.g. native or mutated glycosyltransferases). The method can be 
used to define the phases describing the X-ray diffraction data of a polypeptide of unknown structure when 
only the amplitudes are known. Commonly used computer software packages for molecular replacement are 
X-PLOR (Brunger 1992, Nature 355: 472-475), AMoRE (Navaza, 1994, Acta Crystallogr. A50:157-163), the 
CCP4 package (Collaborative Computational Project, Number 4, "The CCP4 Suite: Programs for Protein 
Crystallography", Acta Cryst., Vol. D50, pp. 760-763, 1994), and the MERLOT package (P.M.D. Fitzgerald, 
J. Appl. Cryst., Vol. 21, pp. 273-278, 1988). It is preferable that the resulting structure not exhibit a 
root-mean-square deviation of more than 3 Å. 
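
The root-mean-square deviation mentioned above can be computed from paired atomic coordinates. The sketch below assumes the two coordinate sets are already superimposed and list equivalent atoms in the same order; it does not perform the superposition itself, and the function name and example coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates.

    Assumes the structures are already superimposed and that atoms are
    listed in the same order in both sets.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have equal length")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical two-atom example, in Angstroms.
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
    value = rmsd(model, reference)
    print(f"RMSD = {value:.2f} A; within 3 A: {value <= 3.0}")
```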

Molecular replacement computer programs generally involve the following steps: (1) determining 
the number of molecules in the unit cell and defining the angles between them (self rotation function); (2) 
rotating the known structure (e.g. glycosyltransferase) against diffraction data to define the orientation of the 
molecules in the unit cell (rotation function); (3) translating the known structure in three dimensions to 
correctly position the molecules in the unit cell (translation function); (4) determining the phases of the X-ray 
diffraction data and calculating an R-factor from the reference data set and from the new data, 
wherein an R-factor between 30-50% indicates that the orientations of the atoms in the unit cell have been 
reasonably determined by the method; and (5) optionally, decreasing the R-factor to about 20% by refining the 
new electron density map using iterative refinement techniques known to those skilled in the art (refinement). 

In an embodiment of the invention, a method is provided for determining three dimensional structures 
of polypeptides with unknown structure (e.g. additional native or mutated glycosyltransferases) by (a) applying the 
structural coordinates of a glycosyltransferase structure to provide an X-ray crystallographic data set for a 
polypeptide of unknown structure, and (b) determining a low energy conformation of the resulting structure. 

The structural coordinates of a glycosyltransferase structure may be applied to nuclear magnetic 
resonance (NMR) data to determine the three dimensional structures of polypeptides (e.g. additional native or 
mutated glycosyltransferases, or polypeptides comprising an SGC domain). (See for example, Wuthrich, 1986, 
John Wiley and Sons, New York: 176-199; Pflugrath et al., 1986, J. Molecular Biology 189: 383-386; Kline et 
al., 1986, J. Molecular Biology 189:377-382). While the secondary structure of a polypeptide may often be 
determined by NMR data, the spatial connections between individual pieces of secondary structure are not as 
readily determined. The structural coordinates of a polypeptide defined by X-ray crystallography can guide the 
NMR spectroscopist to an understanding of the spatial interactions between secondary structural elements in a 
polypeptide of related structure. Information on spatial interactions between secondary structural elements can 
greatly simplify Nuclear Overhauser Effect (NOE) data from two-dimensional NMR experiments. In addition, 
applying the structural coordinates after the determination of secondary structure by NMR techniques 
simplifies the assignment of NOEs relating to particular amino acids in the polypeptide sequence and does not 
greatly bias the NMR analysis of polypeptide structure. 

In an embodiment, the invention relates to a method of determining three dimensional structures of 
polypeptides with unknown structures, preferably native or mutated glycosyltransferases or polypeptides 
comprising an SGC domain, by applying the structural coordinates of a glycosyltransferase structure of the 
invention to nuclear magnetic resonance (NMR) data of the unknown structure. This method comprises the 
steps of: (a) determining the secondary structure of an unknown structure using NMR data; and (b) simplifying 
the assignment of through-space interactions of amino acids. The term "through-space interactions" defines 
the orientation of the secondary structural elements in the three dimensional structure and the distances 
between amino acids from different portions of the amino acid sequence. The term "assignment" defines a 
method of analyzing NMR data and identifying which amino acids give rise to signals in the NMR spectrum. 

Identification of Modulators of Glycosyltransferases 

Modulators (e.g. inhibitors) of a glycosyltransferase (or a binding site or domain thereof) may be 
designed and identified that may modify the inappropriate activity of a glycosyltransferase involved in a 
clinical disorder. The rational design and identification of modulators of glycosyltransferases can be 
accomplished by utilizing the atomic structural coordinates that define a glycosyltransferase structure, or a part 
thereof. Structure-based modulator design identification methods are powerful techniques that can involve 
searches of computer data bases containing a variety of potential modulators and chemical functional groups. 
(See Kuntz et al., 1994, Acc. Chem. Res. 27: 117; Guida, 1994, Current Opinion in Struc. Biol. 4: 777; and 
Colman, 1994, Current Opinion in Struc. Biol. 4: 868, for reviews of structure-based drug design and 
identification; and Kuntz et al., 1982, J. Mol. Biol. 162:269; Kuntz et al., 1994, Acc. Chem. Res. 27: 117; Meng 
et al., 1992, J. Comp. Chem. 13: 505; Bohm, 1994, J. Comp. Aided Molec. Design 8: 623 for methods of 
structure-based modulator design). 

The glycosyltransferase structures, and parts thereof described herein, and the structures of other 
polypeptides determined by the homology modeling, molecular replacement, and NMR techniques described 
herein can also be applied to modulator design and identification methods. 

Modulators of glycosyltransferases may be identified by docking the computer representation of 
compounds from a data base of molecules. Data bases which may be used include ACD (Molecular Designs 
Limited), NCI (National Cancer Institute), CCDC (Cambridge Crystallographic Data Center), CAST 
(Chemical Abstract Service), Derwent (Derwent Information Limited), Maybridge (Maybridge Chemical 
Company Ltd), Aldrich (Aldrich Chemical Company), DOCK (University of California in San Francisco), and 
the Directory of Natural Products (Chapman & Hall). Computer programs such as CONCORD (Tripos 
Associates) or DB-Converter (Molecular Simulations Limited) can be used to convert a data set represented in 
two dimensions to one represented in three dimensions. 

The computer programs may comprise the following steps: 

(a) docking a computer representation of a structure of a compound into a computer representation 
of an active-site (e.g. binding site or SGC domain) of a glycosyltransferase defined in accordance 
with the invention using the computer program, or by interactively moving the representation of 
the compound into the representation of the active-site; 

(b) characterizing the geometry and the complementary interactions formed between the atoms of the 
active-site and the compound; optionally 

(c) searching libraries for molecular fragments which can fit into the empty space between the 
compound and active site and can be linked to the compound; and 

(d) linking the fragments found in (c) to the compound and evaluating the new modified compound. 

Methods are also provided for identifying a potential modulator of a glycosyltransferase function by 
docking a computer representation of a compound with a computer representation of a structure of a 
glycosyltransferase that is defined by the binding sites, atomic interactions, atomic contacts, or atomic 
structural coordinates described herein. In an embodiment the method comprises the following steps: 

(a) docking a computer representation of a compound from a computer data base with a 
computer representation of a selected site (e.g. the sugar nucleotide donor or acceptor 
binding site, or SGC domain) on a glycosyltransferase structure defined in accordance with 
the invention to obtain a complex; 

(b) determining a conformation of the complex with a favourable geometric fit and favourable 
complementary interactions; and 

(c) identifying compounds that best fit the selected site as potential modulators of the 
glycosyltransferase. 

"Docking" refers to a process of placing a compound in close proximity with an active site of a 
polypeptide (i.e. a glycosyltransferase), or a process of finding low energy conformations of a 
compound/polypeptide complex (i.e. compound/glycosyltransferase complex). 

Examples of other computer programs that may be used for structure-based modulator design are 
CAVEAT (Bartlett et al., 1989, in "Chemical and Biological Problems in Molecular Recognition", Roberts, 
S.M.; Ley, S.V.; Campbell, N.M. eds; Royal Society of Chemistry: Cambridge, pp 182-196); FLOG (Miller et 
al., 1994, J. Comp. Aided Molec. Design 8:153); PRO Modulator (Clark et al., 1995, J. Comp. Aided Molec. 
Design 9:13); MCSS (Miranker and Karplus, 1991, Proteins: Structure, Function, and Genetics 8:195); and 
GRID (Goodford, 1985, J. Med. Chem. 28:849). 

In an embodiment of the invention, a method is provided for identifying potential modulators of
glycosyltransferase function. The method utilizes the structural coordinates of a glycosyltransferase three
dimensional structure, or binding site or domain thereof. The method comprises the steps of (a) generating a
computer representation of a glycosyltransferase structure, preferably an N-acetylglucosaminyltransferase I
structure, and docking a computer representation of a compound from a computer data base with a computer
representation of an active site (e.g. sugar nucleotide donor or acceptor binding site) of the glycosyltransferase
to form a complex; (b) determining a conformation of the complex with a favourable geometric fit or favourable
complementary interactions; and (c) identifying compounds that best fit the glycosyltransferase active-site as
potential modulators of glycosyltransferase function. The initial glycosyltransferase structure may or may not
have compounds bound to it. A favourable geometric fit occurs when the surface area of a compound in a
compound-glycosyltransferase complex is in close proximity with the surface area of the active-site of the
glycosyltransferase without forming unfavourable interactions. A favourable complementary interaction occurs
where a compound in a compound-glycosyltransferase complex interacts by hydrophobic, aromatic, ionic, or
hydrogen donating and accepting forces, with the active-site of a glycosyltransferase without forming
unfavourable interactions. Unfavourable interactions may be steric hindrance between atoms in the compound
and atoms in the glycosyltransferase active-site.
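
The two criteria just defined can be illustrated with a toy scoring function. The sketch below is not the
method of any program named in this document. The distance cut-offs CLASH and CONTACT, the function names
and the coordinates are all assumptions chosen only for illustration. It counts compound atoms that lie close
to the site (geometric fit) and counts steric clashes (unfavourable interactions).

```python
import math

CLASH = 2.5    # assumed cut-off (angstroms): closer than this counts as steric hindrance
CONTACT = 4.5  # assumed cut-off (angstroms): within this counts as close proximity


def score_pose(compound_atoms, site_atoms):
    """Return (contacts, clashes) for one docked pose.

    Both arguments are lists of (x, y, z) tuples in angstroms.
    """
    contacts = 0
    clashes = 0
    for c in compound_atoms:
        for s in site_atoms:
            d = math.dist(c, s)
            if d < CLASH:
                clashes += 1
            elif d <= CONTACT:
                contacts += 1
    return contacts, clashes


def is_favourable(compound_atoms, site_atoms):
    """Keep a pose only if it touches the site and has no clashes."""
    contacts, clashes = score_pose(compound_atoms, site_atoms)
    return contacts > 0 and clashes == 0


# Hypothetical coordinates, for illustration only.
site = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
good_pose = [(2.0, 3.0, 0.0)]  # near both site atoms, clashing with neither
bad_pose = [(0.5, 0.0, 0.0)]   # overlaps the first site atom
```

A real docking program would also score the complementary interactions (hydrophobic, aromatic, ionic,
hydrogen bonding) described above; this sketch covers only the geometric part.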

In another embodiment, potential modulators are identified utilizing a glycosyltransferase structure
with or without compounds bound to it. The method comprises the steps of (a) modifying a computer
representation of a glycosyltransferase (e.g. an N-acetylglucosaminyltransferase I) having one or more
compounds bound to it, where the computer representations of the compound or compounds and
glycosyltransferase are defined by atomic structural coordinates; (b) determining a conformation of the
complex with a favourable geometric fit and favourable complementary interactions; and (c) identifying the
compounds that best fit the glycosyltransferase active site as potential modulators. A computer representation
may be modified by deleting or adding a chemical group or groups. Computer representations of the chemical
groups can be selected from a computer database.

Another way of identifying potential modulators is to modify an existing modulator in a polypeptide
active-site. The computer representation of modulators can be modified within the computer representation of
a glycosyltransferase active-site. This technique is described in detail in Molecular Simulations User Manual,
1995 in LUDI. The computer representation of a modulator may be modified by deleting a chemical group or
groups, or by adding a chemical group or groups. After each modification to a compound, the atoms of the
modified compound and active-site can be shifted in conformation and the distance between the modulator and
the active site atoms may be scored on the basis of geometric fit and favourable complementary interactions
between the molecules. Compounds with favourable scores are potential modulators.

Compounds designed by modulator building or modulator searching computer programs may be
screened to identify potential modulators. Examples of such computer programs include programs in the
Molecular Simulations Package (Catalyst), ISIS/HOST, ISIS/BASE, and ISIS/DRAW (Molecular Designs
Limited), and UNITY (Tripos Associates). A building program may be used to replace computer
representations of chemical groups in a compound complexed with a glycosyltransferase with groups from a
computer data base. A searching program may be used to search computer representations of compounds from
a computer database that have similar three dimensional structures and similar chemical groups as a compound
that binds to a glycosyltransferase. The programs may be operated on the structure of the active-site (e.g.
binding sites, or SGC domain) of a glycosyltransferase structure, preferably an
N-acetylglucosaminyltransferase I.

A typical program may comprise the following steps:

(a) mapping chemical features of a compound, such as hydrogen bond donors or acceptors,
hydrophobic/lipophilic sites, positively ionizable sites, or negatively ionizable sites;

(b) adding geometric constraints to selected mapped features; and

(c) searching data bases with the model generated in (b).
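
Steps (a) to (c) can be sketched as follows. This is a simplified illustration and does not reproduce the
algorithm of any program named above. The feature labels, the constraint format (type, type, distance,
tolerance), the compound names and all coordinates are assumptions. Each constraint is checked independently,
so the sketch does not enforce that the same feature is reused consistently across constraints.

```python
import math


def satisfies(features, constraint):
    """True if some pair of distinct features has the required types and distance."""
    type_a, type_b, target, tol = constraint
    for i, (kind_i, xyz_i) in enumerate(features):
        for j, (kind_j, xyz_j) in enumerate(features):
            if i == j or kind_i != type_a or kind_j != type_b:
                continue
            if abs(math.dist(xyz_i, xyz_j) - target) <= tol:
                return True
    return False


def search(database, model):
    """Step (c): names of compounds that meet every constraint in the model."""
    return [name for name, feats in database.items()
            if all(satisfies(feats, c) for c in model)]


# Step (a): mapped features of two hypothetical compounds.
database = {
    "cmpd_1": [("donor", (0.0, 0.0, 0.0)), ("acceptor", (3.0, 0.0, 0.0)),
               ("hydrophobic", (0.0, 4.0, 0.0))],
    "cmpd_2": [("donor", (0.0, 0.0, 0.0)), ("acceptor", (6.0, 0.0, 0.0))],
}

# Step (b): geometric constraints as (type, type, distance, tolerance), in angstroms.
model = [("donor", "acceptor", 3.0, 0.5), ("donor", "hydrophobic", 4.0, 0.5)]

hits = search(database, model)
```

Here only cmpd_1 satisfies both constraints, so it is the single hit returned by the search.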

In an embodiment of the invention, a method of identifying potential modulators of a
glycosyltransferase, preferably an N-acetylglucosaminyltransferase I, is provided using the three dimensional
conformation of the glycosyltransferase in various modulator construction or modulator searching computer
programs on compounds complexed with the glycosyltransferase. The method comprises the steps of (a)
generating a computer representation of one or more compounds complexed with a glycosyltransferase; (b) (i)
searching a data base for a compound with a similar geometric structure or similar chemical groups to the
generated compounds using a computer program that searches computer representations of compounds from a
database that have similar three dimensional structures and similar chemical groups, or (ii) replacing portions
of the compounds complexed with the glycosyltransferase with similar chemical structures (i.e. nearly identical
shape and volume) from a database using a compound construction computer program that replaces computer
representations of chemical groups with groups from a computer database, where the representations of the
compounds are defined by structural coordinates.

A compound that interacts with a glycosyltransferase or selected binding sites or domains thereof
identified using a method of the invention may be used as a modulator of any glycosyltransferase or
composition bearing the interacting binding site or domains. Therefore, the invention features a modulator of a
glycosyltransferase identified by a method of the invention.

The invention further contemplates a method for designing potential inhibitors of a
glycosyltransferase comprising the step of using the structural coordinates of a sugar nucleotide donor or
acceptor or component thereof, or an acceptor or components thereof, defined in relation to its spatial
association with a glycosyltransferase structure or a binding site or domain thereof, to generate a compound
that is capable of associating with the glycosyltransferase or binding site or domain thereof.

In an embodiment of the invention, a method is provided for designing potential inhibitors of a
glycosyltransferase comprising the step of using the structural coordinates of uridine, uracil, or UDP listed in
Table 3 [ATOMS 2828-2835 (uracil); 2836-2844 (ribose); and 2845-2851 (diphosphate)] to generate a
compound for associating with the active site of a glycosyltransferase. The following steps are employed in a
particular method of the invention: (a) generating a computer representation of uridine, uracil, or UDP, defined
by its structural coordinates listed in Table 3; and (b) searching for molecules in a data base that are structurally
or chemically similar to the defined uridine, uracil, or UDP, using a searching computer program, or replacing
portions of the compound with similar chemical structures from a database using a compound building
computer program.

In another embodiment of the invention, a method is provided for designing potential inhibitors of a
glycosyltransferase comprising the step of using the structural coordinates of UDP-GlcNAc listed in Table 3
(ATOMS 2813-2851), to generate a compound for associating with the active site of a glycosyltransferase.
The following steps are employed in a particular method of the invention: (a) generating a computer
representation of UDP-GlcNAc defined by its structural coordinates listed in Table 3; and (b) searching for
molecules in a data base that are structurally or chemically similar to the defined UDP-GlcNAc using a
searching computer program, or replacing portions of the compound with similar chemical structures from a
database using a compound building computer program.

In another embodiment of the invention, a method is provided for designing potential inhibitors of a
glycosyltransferase comprising the step of using the structural coordinates of a Man5GlcNAc2 acceptor listed in
Table 4, to generate a compound for associating with the active site of a glycosyltransferase. In Table 4, the
coordinates of a Man5GlcNAc2 acceptor are listed as ATOMS 3043 through 3126 where the mannose and
GlcNAc residues designated as X, Y, U, V, W, Z, and A have the following positions in the acceptor:

[Acceptor diagram; surviving residue labels: Manα1-6 (U), Manα1-6 (W), Manα1-3 (X)]

The following steps are employed in a particular method of the invention: (a) generating a computer
representation of a Man5GlcNAc2 acceptor defined by its structural coordinates listed in Table 4; and (b)
searching for molecules in a data base that are structurally or chemically similar to the defined Man5GlcNAc2
acceptor using a searching computer program, or replacing portions of the compound with similar chemical
structures from a database using a compound building computer program.
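
Step (a) in each of the embodiments above starts by picking out the atoms of a substrate fragment by its
ATOM numbers. The sketch below shows one way this could be done. The serial-number ranges are those
quoted in the text for Table 3 and Table 4; the whitespace-separated record layout ("serial name x y z"), the
atom names, the function names and the coordinates are assumptions made for illustration and are not taken
from the tables.

```python
# Serial-number ranges quoted in the text (inclusive).
FRAGMENTS = {
    "uracil": (2828, 2835),       # Table 3
    "ribose": (2836, 2844),       # Table 3
    "diphosphate": (2845, 2851),  # Table 3
    "UDP-GlcNAc": (2813, 2851),   # Table 3
    "Man5GlcNAc2": (3043, 3126),  # Table 4
}


def parse_record(line):
    """Parse an assumed 'serial name x y z' whitespace-separated line."""
    serial, name, x, y, z = line.split()
    return int(serial), name, float(x), float(y), float(z)


def select_atoms(records, first, last):
    """Keep records whose atom serial number lies in [first, last]."""
    return [r for r in records if first <= r[0] <= last]


# Made-up records; the real coordinates are those listed in Table 3.
lines = [
    "2827 CA 1.0 2.0 3.0",
    "2828 N1 4.0 5.0 6.0",
    "2835 C6 7.0 8.0 9.0",
    "2836 C1' 1.5 2.5 3.5",
]
records = [parse_record(line) for line in lines]
uracil = select_atoms(records, *FRAGMENTS["uracil"])
```

The selected records would then be passed to whichever searching or building program is used in step (b).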

It will be appreciated that a modulator of a glycosyltransferase may be identified by generating an 
actual three-dimensional model of a binding cavity, synthesizing a compound, and examining the components 
to find whether the required interaction occurs. 

Potential modulators of glycosyltransferases identified using the above-described methods may be
prepared using methods described in standard reference sources utilized by those skilled in the art. For
example, organic compounds may be prepared by organic synthetic methods described in references such as
March, 1994, Advanced Organic Chemistry: Reactions, Mechanisms, and Structure, New York, McGraw Hill.

The invention also relates to a potential modulator identified by the methods of the invention. In
particular, classes of modulators of glycosyltransferases are provided that are based on the three-dimensional
structure of a sugar nucleotide donor, or component thereof, or acceptor, defined in relation to the sugar
nucleotide donor's or acceptor's spatial association with a glycosyltransferase structure. Modulators of
glycosyltransferases comprise a compound comprising the structure of uracil, uridine, ribose, pyrophosphate,
or UDP, and having one or more, preferably all, of the structural coordinates of uracil, uridine, ribose,
pyrophosphate, or UDP of Table 3 [ATOMS 2828-2835 (uracil); 2836-2844 (ribose); and 2845-2851
(diphosphate)]. In an embodiment, modulators are provided comprising the structure of UDP-GlcNAc and
having one or more, preferably all, of the structural coordinates of UDP-GlcNAc of Table 3 (ATOMS 2813-
2851). Functional groups in the uracil, uridine, ribose, pyrophosphate, UDP, or UDP-GlcNAc modulators may
be substituted with, for example, alkyl, alkoxy, hydroxyl, aryl, cycloalkyl, alkenyl, alkynyl, thiol, thioalkyl,
thioaryl, amino, or halo, or they may be modified using techniques known in the art.

Modulators are also contemplated that comprise the structure of a Man5GlcNAc2 acceptor for a
glycosyltransferase with the structural coordinates of the Man5GlcNAc2 acceptor listed in Table 4 (ATOMS 3043
through 3126). Functional groups in an acceptor structure may be substituted with, for example, alkyl, alkoxy,
hydroxyl, aryl, cycloalkyl, alkenyl, alkynyl, thiol, thioalkyl, thioaryl, amino, or halo, or they may be modified
using techniques known in the art.

The invention contemplates all optical isomers and racemic forms of the modulators of the invention.

Compositions and Methods of Treatment 

The modulators of the invention may be used to modulate the biological activity of a
glycosyltransferase in a cell, including modulating a pathway in a cell regulated by the glycosyltransferase or
modulating a glycosyltransferase with inappropriate activity in a cellular organism. In addition, a
glycosyltransferase structure of the invention may be used to devise protocols to modulate the biological
activity of a glycosyltransferase in a cell.

Cellular assays, as well as animal model assays in vivo, may be used to test the activity of a potential
modulator of a glycosyltransferase as well as to diagnose a disease associated with inappropriate
glycosyltransferase activity. In vivo assays are also useful for testing the bioactivity of a potential modulator
designed by the methods of the invention.

The modulators (e.g. inhibitors) identified using the methods of the invention may be useful in the
treatment and prophylaxis of tumor growth and metastasis of tumors. Anti-metastatic effects of inhibitors can
be demonstrated using a lung colonization assay. For example, melanoma cells treated with an inhibitor may
be injected into mice and the ability of the melanoma cells to colonize the lungs of the mice may be examined
by counting tumor nodules on the lungs after death. Suppression of tumor growth in mice by the inhibitor
administered orally or intravenously may be examined by measuring tumor volume.

An inhibitor identified using the invention may have particular application in the prevention of tumor
recurrence after surgery, i.e. as an adjuvant therapy.

An inhibitor may be especially useful in the treatment of various forms of neoplasia such as
leukemias, lymphomas, melanomas, adenomas, sarcomas, and carcinomas of solid tissues in patients. In
particular, inhibitors can be used for treating malignant melanoma, pancreatic cancer, cervico-uterine cancer,
ovarian cancer, cancer of the kidney such as metastatic renal cell carcinoma, stomach, lung, rectum, breast,
bowel, gastric, liver, thyroid, head and neck cancers such as unresectable head and neck cancers, lymphangitis
carcinomatosis, cancers of the cervix, breast, salivary gland, leg, tongue, lip, bile duct, pelvis, mediastinum,
urethra, bronchogenic, bladder, esophagus and colon, non-small cell lung cancer, and Kaposi's Sarcoma, which
is a form of cancer associated with HIV-infected patients with Acquired Immune Deficiency Syndrome
(AIDS). The inhibitors may also be used for other anti-proliferative conditions such as bacterial and viral
infections, in particular AIDS.

An inhibitor identified in accordance with the present invention may be used to treat 
immunocompromised subjects. For example, they may be used in a subject infected with HIV, or other viruses 
or infectious agents including bacteria, fungi, and parasites, in a subject undergoing bone marrow transplants, 
and in subjects with chemical or tumor-induced immune suppression. 

Inhibitors may be used as hemorestorative agents and in particular to stimulate bone marrow cell
proliferation, in particular following chemotherapy or radiotherapy. The myeloproliferative activity of an
inhibitor of the invention may be determined by injecting the inhibitor into mice, sacrificing the mice,
removing bone marrow cells and measuring the ability of the inhibitor to stimulate bone marrow proliferation
by directly counting bone marrow cells and by measuring clonogenic progenitor cells in methylcellulose
assays. The inhibitors can also be used as chemoprotectants, and in particular to protect mucosal epithelium
following chemotherapy.

An inhibitor identified in accordance with the invention also may be used as an antiviral agent, in
particular on membrane enveloped viruses such as retroviruses, influenza viruses, cytomegaloviruses and
herpes viruses. An inhibitor may also be used to treat bacterial, fungal, and parasitic infections. For example, a
small molecule inhibitor can be used to prevent or treat infections caused by the following: Neisseria species
such as Neisseria meningitidis and N. gonorrhoeae; Chlamydia species such as Chlamydia pneumoniae,
Chlamydia psittaci, and Chlamydia trachomatis; Escherichia coli; Haemophilus species such as Haemophilus
influenzae; Yersinia enterocolitica; Salmonella species such as S. typhimurium; Shigella species such as Shigella
flexneri; Streptococcus species such as S. agalactiae and S. pneumoniae; Bacillus species such as Bacillus
subtilis; Branhamella catarrhalis; Borrelia burgdorferi; Pseudomonas aeruginosa; Coxiella burnetii;
Campylobacter species such as C. hyoilei; Helicobacter pylori; and Klebsiella species such as Klebsiella
pneumoniae.

An inhibitor may also be used in the treatment of inflammatory diseases such as rheumatoid arthritis,
asthma, inflammatory bowel disease, and atherosclerosis.

An inhibitor may also be used to augment the anti-cancer effects of agents such as interleukin-2 and
poly-IC, to augment natural killer and macrophage tumoricidal activity, induce cytokine synthesis and
secretion, enhance expression of LAK and HLA class I specific antigens, activate protein kinase C, stimulate
bone marrow cell proliferation including hematopoietic progenitor cell proliferation, and increase engraftment
efficiency and colony-forming unit activity, to confer protection against chemotherapy and radiation therapy
(e.g. chemoprotective and radioprotective agents), and to accelerate recovery of bone marrow cellularity
particularly when used in combination with chemical agents commonly used in the treatment of human diseases
including cancer and acquired immune deficiency syndrome (AIDS). For example, an inhibitor can be used as
a chemoprotectant in combination with anti-cancer agents including doxorubicin, 5-fluorouracil,
cyclophosphamide, and methotrexate, and in combination with isoniazid or NSAID.

The present invention thus provides a method for treating the above-mentioned conditions in a subject
comprising administering to a subject an effective amount of a modulator of the invention. The invention also
contemplates a method for stimulating or inhibiting tumor growth or metastasis in a subject comprising
administering to a subject an effective amount of a modulator of the invention.

The invention still further relates to a pharmaceutical composition which comprises a
glycosyltransferase structure of the invention or a part thereof (e.g. an active site, a phosphate-binding loop lid,
an SGC domain, DxD motif), or a modulator of the invention in an amount effective to regulate one or more
of the above-mentioned conditions (e.g. tumor growth or metastasis) and a pharmaceutically acceptable carrier,
diluent or excipient.

The compositions of the invention are administered to subjects in a biologically compatible form
suitable for pharmaceutical administration in vivo. By "biologically compatible form suitable for
administration in vivo" is meant a form of the active ingredient to be administered in which any toxic effects
are outweighed by the therapeutic effects of the active ingredient. The term "subject" is intended to include
mammals and includes humans, dogs, cats, mice, rats, and transgenic species thereof. Administration of a
therapeutically active amount of the pharmaceutical compositions of the present invention is defined as an
amount effective, at dosages and for periods of time necessary to achieve the desired result. For example, a
therapeutically active amount of a modulator of the invention may vary according to factors such as the
condition, age, sex, and weight of the individual. Dosage regimes may be adjusted to provide the optimum
therapeutic response. For example, several divided doses may be administered daily or the dose may be
proportionally reduced as indicated by the exigencies of the therapeutic situation.

The active compound may be administered in a convenient manner such as by injection
(subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal application, or intracerebral
administration.

A pharmaceutical composition of the invention can be administered to a subject in an appropriate
carrier or diluent, co-administered with enzyme inhibitors or in an appropriate carrier such as microporous or
solid beads or liposomes. The term "pharmaceutically acceptable carrier" as used herein is intended to include
diluents such as saline and aqueous buffer solutions. Liposomes include water-in-oil-in-water emulsions as
well as conventional liposomes (Strejan et al., (1984) J. Neuroimmunol 7:27). The active compound may also
be administered parenterally or intraperitoneally. Dispersions can also be prepared in glycerol, liquid
polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these
preparations may contain a preservative to prevent the growth of microorganisms. Depending on the route of
administration, the active compound may be coated to protect the compound from the action of enzymes, acids,
and other natural conditions which may inactivate the compound.

Therapeutic administration of polypeptide modulators may also be accomplished using gene therapy.
A nucleic acid including a promoter operatively linked to a heterologous polypeptide may be used to produce
high-level expression of the polypeptide in cells transfected with the nucleic acid. DNA or isolated nucleic
acids may be introduced into cells of a subject by conventional nucleic acid delivery systems. Suitable delivery
systems include liposomes, naked DNA, and receptor-mediated delivery systems, and viral vectors such as
retroviruses, herpes viruses, and adenoviruses.

The following non-limiting examples are illustrative of the present invention:

EXAMPLE 1 

Crystals of alpha-1,3-mannosyl-glycoprotein beta-1,2-N-acetylglucosaminyltransferase (GnT-I) were
grown by the vapour-diffusion method from protein drops containing 10 mg/ml GnT-I, 10 mM MES buffer,
pH 5.5, 270 mM KCl, 2-5 mM MnCl2, and 10 mM UDP-GlcNAc, mixed with, and equilibrated against, 15-
25% polyethylene glycol 8000, 100 mM Tris buffer, pH 7.9, 0 to 5% glycerol, and 0 to 10% isopropanol.
Plate-like crystals grew within a few days, in space group P2₁2₁2₁ (a = 40.4 Å, b = 82.4 Å, c = 102.5 Å,
α = β = γ = 90°), with one molecule in the asymmetric unit, and 40% solvent content. Data were collected from
the crystals flash-frozen in a 100 K N2 stream, after a ten-minute wash with 21% polyethylene glycol 8000,
15% glycerol, and 100 mM Tris buffer, pH 7.9.
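
As a rough consistency check on the quoted cell and solvent content, the cell volume and a Matthews-type
solvent estimate can be computed from the cell edges. In the sketch below the cell edges and the one molecule
per asymmetric unit come from the text. The protein mass of 40 kDa is an assumption (the text does not state
it), as are the function names. The constant 1.23 is the usual Matthews approximation for protein crystals.

```python
def orthorhombic_volume(a, b, c):
    """Cell volume in cubic angstroms; valid when all cell angles are 90 degrees."""
    return a * b * c


def solvent_fraction(volume, z, mass_da):
    """Matthews estimate: 1 - 1.23 / Vm, with Vm in cubic angstroms per dalton."""
    vm = volume / (z * mass_da)
    return 1.0 - 1.23 / vm


A, B, C = 40.4, 82.4, 102.5  # cell edges from the text, in angstroms
Z = 4                        # four asymmetric units per cell in P2(1)2(1)2(1)
MASS_DA = 40000.0            # assumed protein mass in daltons; not stated in the text

volume = orthorhombic_volume(A, B, C)
fraction = solvent_fraction(volume, Z, MASS_DA)
print(f"cell volume = {volume:.0f} A^3, solvent fraction = {fraction:.2f}")
```

With the assumed mass the estimate falls close to the 40% solvent content quoted above; a different mass
would shift the result accordingly.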

Atomic structural coordinates of an N-acetylglucosaminyltransferase I are set out in Table 1. Atomic
coordinates of an N-acetylglucosaminyltransferase I with bound MeHg are set out in Table 2. The atomic
structural coordinates of a rabbit N-acetylglucosaminyltransferase I bound to UDP-GlcNAc and a manganese
2+ ion are shown in Table 3. Atomic structural coordinates of an N-acetylglucosaminyltransferase I with
acceptor are shown in Table 4. Figures 1 to 26, 28 to 30, and 32 to 40B illustrate glycosyltransferase
structures, or binding sites or domains thereof.

EXAMPLE 2

The x-ray crystal structure of a soluble fragment containing the catalytic domain of a rabbit
(Oryctolagus cuniculus) GnT I was determined at 1.4 Å resolution. The 342 residue catalytic domain of GnT I
was expressed as an N-terminal histidine-tagged fusion protein (Sarkar et al., Glycoconjugate J. 15:193-197,
1998), using the baculovirus/Sf9 system in a 3.5 litre bioreactor. The protein was purified using a CM HyperD
F column, followed by a Ni-affinity column. The histidine tag was removed by enterokinase cleavage. Crystals
were grown using the hanging-drop vapor diffusion method, from drops containing 10 mg/ml protein, 10 mM
MES pH 6.5, 250 mM KCl, 2 mM MnCl2, and 10 mM UDP-GlcNAc, and wells containing 17.5%-19.5%
PEG 8000, 5% glycerol, and 100 mM Tris-HCl pH 7.9. Native and two-wavelength mercury-derivative data
were collected using frozen crystals on the F2 beam line at the Cornell High Energy Synchrotron Source. The
crystals grow in space group P2₁2₁2₁, with cell parameters a = 40.4 Å, b = 82.4 Å, c = 102.5 Å. The structure
was solved using the multiwavelength anomalous dispersion technique. GnT I contains both an eight-stranded
mixed beta-sheet, flanked by six alpha helices, and a four-stranded mixed beta sheet, backed by three alpha
helices. The structure reveals that the catalytic domain has dimensions 54 Å x 52 Å x 37 Å, with a large pocket
on one face capable of holding both the UDP-GlcNAc donor and the Man5Gn2 acceptor. Sequence comparison
shows that residues found in the pocket are very well conserved among GnT I sequences from different
species. The pocket is flanked by a loop, not seen in the electron density map, which plays a role in either
catalysis or substrate binding.

EXAMPLE 3 

X-ray Crystal Structure of N-Acetylglucosaminyltransferase I: Structure, Mechanism, and the SGC
Superfamily

Overall Structure 

The catalytic fragment of rabbit GnT I (residues 106-447; Sarkar et al., 1998) was crystallized in the
presence of UDP-GlcNAc and Mn2+, and solved by the multi-wavelength anomalous diffraction (MAD)
phasing method using a methylmercury chloride derivative (Table 6). In particular, crystals were grown using
the hanging drop vapor diffusion method, by mixing equal 1.5 µl volumes of protein solution (10 mg/ml GnT I
catalytic fragment, 10 mM MES buffer, pH 5.5, 270 mM KCl, 2 mM MnCl2 and 10 mM UDP-GlcNAc) with
well solution (15-25% polyethylene glycol 8000, 100 mM Tris buffer, pH 7.9, and 5% glycerol), and
equilibrating against 1 ml of the well solution. A mercury derivative was obtained by soaking a crystal in well
solution containing 20 mM MeHgCl. All data were collected using Quantum 4 charge-coupled device detectors
on the F2 beamline of the Cornell High Energy Synchrotron Source, using crystals flash-frozen in the 100 K
N2 stream. Data were integrated, scaled, and reduced with DENZO and SCALEPACK (Otwinowski and
Minor, 1997). The mercury position was identified with SOLVE (Terwilliger and Berendzen, 1999), and
refined using SHARP (La Fortelle and Bricogne, 1997). Solvent flattening and histogram matching were
performed using DM (Cowtan, 1994). The resultant experimental map was traced using the program O (Jones
et al., 1991), and the model refined with multiple rounds of manual rebuilding using O and OOPS2 (Kleywegt
and Jones, 1996), alternated with simulated annealing and positional and B-factor refinement using CNS
(Brunger et al., 1998). The initial model was also refined against the "native" and "complex" data in a similar
fashion.

In total, the structure was refined against data sets from 3 different crystals. Of these, no bound
nucleotide sugar or Mn2+ ion was observed in the mercury "derivative" (refined at 1.4 Å resolution) nor in the
structure refined against the data set that was termed "native" (1.5 Å resolution). In contrast to these apo
structures, both components were seen in the "complex" (1.8 Å resolution). Since the native and derivative
data sets were collected on samples that had aged before the x-ray data were collected, it was assumed that in
these cases the UDP-GlcNAc had been hydrolysed.

GnT I is a two-domain protein, with overall dimensions of approximately 65 Å x 40 Å x 50 Å (Figure
25). The N-terminal domain (domain 1: residues 106-317) is an eight-stranded mixed β-sheet (β1-β8), flanked
by six α-helices (α1-α6) and a small two-stranded antiparallel β-sheet (β4' and β8'). The smaller C-terminal
domain (domain 2: residues 354-447) is a four-stranded mixed β-sheet (β9, β10, β13 and β14), flanked by
three α-helices (α7-α9) and a short β-finger (β11 and β12). The two domains are connected by a linker region
(residues 331 to 353) which wraps halfway around domain 1 before starting the first helix of domain 2. The
~1050 Å² interface between domain 1 and domain 2 is quite hydrophilic, and contains 20 bridging water
molecules. The residues buried in the interface on domain 1 are 53% polar, while those in domain 2 are 36%
polar.

The α-helices α3, α5 and α6 sit on "top" of the central β-sheet and create a pocket for the nucleotide-
sugar and oligosaccharide acceptor. Electrostatic potential analysis shows that this pocket is largely acidic, in
contrast to the rest of the protein surface, which is primarily positively charged. The nucleotide sugar itself sits
between helices α3 and α6 and β-strands β1, β2 and β4. The topology and structure of β-strands β1 to β4,
and helices α1 to α3, are similar to those of the corresponding elements in domains possessing the Rossmann
fold; however, the orientation of the nucleotide sugar with respect to these elements is not.

In the native and derivative structures, in which UDP-GlcNAc and the Mn2+ ion were not observed,
there is also no electron density for the 13-residue loop (residues 318-330) adjacent to the nucleotide-sugar
binding site. The "missing loop" is presumed to be disordered in these crystals, as SDS-PAGE analysis of
washed crystals showed the protein to be intact. These residues are structured in the complex, and are found to
form a flap that partially covers the UDP-GlcNAc moiety. Although the loop is structured by UDP-GlcNAc
binding, only the tip of the loop makes direct interactions with it. Approximately 50 Å² is buried between the
tip of the loop and the UDP-GlcNAc phosphates. Structuring the loop also buries ~600 Å² of protein surface
adjacent to the nucleotide-sugar binding site. In these crystals the active site and the loop itself are exposed to
a large solvent channel, and are not involved in crystal contacts. Aside from structuring the loop, there is no
major conformational change associated with UDP-GlcNAc binding. The native and complex structures show a
root-mean-squared-deviation (rmsd) of 0.28 Å, based on the α-carbon atoms of residues 106 to 317 and 331 to 447.
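
The rmsd comparison described above can be sketched as follows. The residue ranges (106 to 317 and 331 to
447) are those given in the text. The coordinates, the dictionary layout and the function names are
hypothetical, and the sketch assumes the two structures have already been superposed; it does not perform
the superposition itself.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length coordinate lists.

    Assumes the structures are already superposed and atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def in_compared_set(residue_number):
    """Residues used in the text: 106 to 317 and 331 to 447 (the loop is excluded)."""
    return 106 <= residue_number <= 317 or 331 <= residue_number <= 447


# Hypothetical alpha-carbon positions keyed by residue number.
native = {106: (0.0, 0.0, 0.0), 320: (9.0, 9.0, 9.0), 331: (1.0, 0.0, 0.0)}
complex_ = {106: (0.0, 0.3, 0.0), 320: (5.0, 5.0, 5.0), 331: (1.0, 0.0, 0.4)}

shared = sorted(n for n in native if n in complex_ and in_compared_set(n))
value = rmsd([native[n] for n in shared], [complex_[n] for n in shared])
```

Residue 320 lies in the loop that is disordered in the apo structures, so it is left out of the comparison,
mirroring the residue ranges used in the text.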

The Nucleotide Sugar and Metal Binding Sites

As shown by the structure of the complex, the UDP-GlcNAc is bound in the anti conformation. The
uracil ring is sandwiched between I187 and the C115-C145 cysteine bridge, and its N3 and O2 make key
hydrogen bond interactions with D144 and H190, respectively (Table 7 and Figure 33A). Moreover, its C5 is
in van der Waals contact with V321, part of the loop structured by UDP-GlcNAc binding. The ribose O2' and
O3' atoms make a water-mediated and a direct hydrogen bond, respectively, with the carboxyl side chain of
D212. This Asp is the middle residue in the DxD motif common to a number of glycosyltransferases, as will
be discussed in detail below.

The Mn2+ ion shows an octahedral geometry coordinated by six "inner-sphere" oxygen atoms (Table
7 and Figure 33B). The α- and β-phosphate of UDP-GlcNAc each contribute a coordinating oxygen atom, as
do three water molecules. These water molecules are, in turn, hydrogen bonded to "outer sphere" protein
residues E211, D213, T315 and G317. The remaining high-energy inner sphere metal ligand is provided by
the carboxyl group of D213, the only direct interaction with the protein. As such, it seems that GnT I does
not have an independent metal binding site capable of binding Mn2+ in the absence of UDP-GlcNAc. In
addition to coordinating the Mn2+ ion, the phosphates make direct interactions with the protein. The
α-phosphate makes a salt bridge with R117 and a hydrogen bond to the amide nitrogen of V321, and the
β-phosphate hydrogen bonds to the hydroxyl group of S322. These interactions with V321 and S322 are an
important component of the UDP-GlcNAc dependent structuring of the loop. Overall, the phosphates are in a
conformation typical of divalent metal-bound nucleotides (Black et al., 1994).

Finally, the GlcNAc moiety itself makes several interactions with the protein (Table 7 and Figure
33C). The vicinal O3 and O4 hydroxyls are hydrogen bonded with the carboxyl group of E211 in a fashion
seen in many lectin-carbohydrate complexes (Vyas, 1991). The O4 hydroxyl appears to play a central role, as
it also makes a strong hydrogen bond with W290. The O6 hydroxyl is hydrogen bonded to a tightly bound
water molecule seen in both the apo and complex structures. Van der Waals interactions are also important,
most notably between the N-acetyl methyl group and the side chains of L269 and L331.

The Glycosyltransferase DxD Motif

The DxD motif has been identified in many glycosyltransferase families and is thought to be involved
in Mn²⁺ ion binding. The motif contains two Asp residues and is typically flanked by apolar residues
(hhhhDxDxh) (Wiggins and Munro, 1998). (See Figures 27 and 31 for DxD motif alignments.) Site-directed
mutagenesis has shown that both Asp residues are required for yeast α-1,3-mannosyltransferase activity
(Wiggins and Munro, 1998). In GnT I, the motif is present in a modified form (EDD, residues 211-213) and,
with L214, forms the i to i+3 residues of a type I β-turn connecting β-strands β4 and β4' (Figure 39). As
such, the highly conserved acidic residues are directed toward the same face of the turn. The fact that β4
runs through the core of the protein is consistent with the observed presence of several apolar residues on
the N-terminal side of the motif.
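
The hhhhDxDxh pattern lends itself to a simple sequence scan. The sketch below is illustrative only: the apolar residue set `APOLAR` is an assumption (the text does not define "h"), each acidic position is relaxed to D or E so that a modified EDD form like that of GnT I is also found, and the test sequence is hypothetical rather than taken from any real enzyme.

```python
import re

# Assumption: "h" (apolar) is approximated by this residue set; the source
# text does not define which residues count as apolar.
APOLAR = "AVLIMFWYC"

# hhhhDxDxh, with each acidic position relaxed to D or E. The lookahead lets
# overlapping candidate sites be reported as well.
MOTIF = re.compile(rf"(?=([{APOLAR}]{{4}}[DE].[DE].[{APOLAR}]))")

def find_dxd(sequence):
    """Return (0-based start, matched text) for every hhhhDxDxh-like site."""
    return [(m.start(1), m.group(1)) for m in MOTIF.finditer(sequence.upper())]

# Hypothetical toy sequence, not a real glycosyltransferase.
hits = find_dxd("MKTAVLIEDDLAPEGS")
print(hits)
```

In each hit, the two acidic positions sit four and six residues after the reported start index.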

The interactions with UDP-GlcNAc and the Mn²⁺ ion illustrate the importance of the motif. As
discussed above, the second conserved Asp (D213) makes the only direct interaction with the bound Mn²⁺ ion.
In addition, it makes a hydrogen bond with one of the metal-coordinating water molecules, which itself is
hydrogen bonded to the first conserved acidic residue of the motif (E211). Overall, these residues are
conformationally constrained by the well-defined octahedral geometry characteristic of Mn²⁺ ion
coordination. Since the phosphates of the nucleotide sugar also coordinate the manganese ion, the ion serves
to define the relative orientation of the nucleotide sugar and the conserved acidic residues. In the case of
GnT I, this positions the GlcNAc moiety of the donor sugar for interaction with the first residue of the
motif. In other sugar nucleoside diphosphate/Mn²⁺-dependent glycosyltransferases, the first Asp of the DxD
motif would be expected to play a carbohydrate-binding role, regardless of the nucleotide sugar
type/linkage. It is well known that Asp is a key residue in carbohydrate binding proteins, and is thus well
suited for such a role. Clearly, the well-conserved DxD motif does not simply serve to bind metal, but rather
coordinates both the Mn²⁺ ion and the sugar moiety of the donor.

Reaction Mechanism 

Catalysis by inverting glycosyltransferases is believed to involve a general base, such as Asp or Glu,
which serves to assist in the deprotonation of the nucleophilic hydroxyl of the acceptor. In GnT I the only
residue capable of playing this role is D291, 4.7 Å away from the GlcNAc C1 (Figure 33C). The structure
shows that the acceptor will be able to approach the UDP-GlcNAc donor, so as to permit in-line nucleophilic
attack and inversion of stereochemistry at the GlcNAc C1. Furthermore, the Mn²⁺ ion is disposed to pull
developing negative charge away from the β-phosphate of the UDP leaving group (a role which may be aided
by R117), in which the hydration state of the ion is likely to play a crucial role (Cowan, 1998; Dudev et al,
1999).

Mechanistically, the reaction is thought to involve an oxocarbenium ion intermediate, similar to that
proposed for glycosidases. Since glycosidases reduce the activation energy of the hydrolysis reaction by
binding their substrates in a distorted conformation, the GlcNAc ring conformation was examined for a similar
effect. However, there was no evidence of significant distortion, suggesting that the UDP-GlcNAc is bound in
a low energy conformation: the sugar ring is a standard ⁴C₁ chair, and the glycosidic linkage is in an allowed
conformation (Petrova et al, 1999). As such, the UDP-GlcNAc is conceivably no more susceptible to
nucleophilic attack by water than it would be in solution. Presumably, the activation energy for catalysis is
derived from acceptor binding.

Loop Structuring and the Acceptor Binding Pocket

Comparison of the apo and complex structures shows that UDP-GlcNAc binding structures the 318-330
loop, forming a flap that partly covers the UDP-GlcNAc (Figure 40A). As discussed above, V321 and
S322, at the tip of the loop, make hydrogen bonds to the α- and β-phosphates of the UDP-GlcNAc. Residues
320-323 form a type IV turn, while the C-terminal residues 324-330 make one complete turn of an α-helix.
The loop folds upon itself, burying residue F327 against R318 and the non-loop residues T315, L331 and
K332. The only conformational changes other than structuring the loop itself are a peptide flip (F316-G317)
and a reorientation of the T315 side chain. These changes are critical, as the G317 carbonyl and the T315
hydroxyl are repositioned to make hydrogen bonds with two of the Mn²⁺ ion coordinating water molecules (see
Figure 40A and Figure 33B).

As shown in Figure 40B, loop structuring creates a deep pocket, terminating over the proposed
catalytic base (D291) and the GlcNAc moiety. The pocket itself can accommodate only a single
monosaccharide residue of the Man5GlcNAc2 acceptor. One complete side of this pocket is formed by the loop
structured upon UDP-GlcNAc binding. As a result, two loop residues (S322 and F326), fully conserved among
active GnT I sequences, are presented to the acceptor binding pocket (Figure 40B). To explore the potential
roles played by these and other residues in the binding pocket, a mannose residue was modeled into the site.
With the attacking O2 hydroxyl positioned between the Asp 291 OE2 and the UDP-GlcNAc C1, only one
general orientation leads to reasonable steric and chemical interactions with the protein. In this orientation, the
exocyclic C6 hydroxymethyl group of the mannose interacts with S322 and F326, while the O3 and O4 point
toward D291, R295 and R415.

The importance of the mannose O3, O4 and O6 predicted by this model is consistent with substrate
studies using synthetic analogues of the trimannose core of the acceptor (Moller, 1992; Reck, 1995). In these
studies it was further shown that even in the trimannose core, the known specificity of GnT I for the
Manα1,3-arm over that of the Manα1,6-arm of the acceptor is preserved. This specificity is presumably
dictated by interactions involving the β-mannose O4, the only other trisaccharide hydroxyl group found to be
important. Extending the model to include all residues of the trimannose core (in its solution conformation)
(Brisson and Cowen, 1983), the β-mannose O4 is positioned to interact with either D291 or D292. Similar
interactions cannot be made when the 6-arm mannose is positioned in the binding pocket. Presumably the
incoming nucleophile and associated binding energy serve to drive the reaction toward the transition state and
ultimately product formation.

Enzyme Kinetics 

Analysis has shown that GnT I proceeds through an ordered sequential Bi Bi kinetic mechanism
(Nishikawa et al, 1988). The enzyme first binds Mn²⁺/UDP-GlcNAc and then the Man5GlcNAc2 acceptor; the
carbohydrate product is released first, followed by UDP. The GnT I structures provide an explanation for
these observations. Since UDP-GlcNAc binding is required to structure the loop and create the acceptor
binding site, it is clear that the nucleotide sugar must bind first. Once catalysis has occurred, the UDP product
cannot maintain the loop in its structured conformation, the acceptor binding pocket is destroyed, and the
oligosaccharide product is released. UDP, which is bound more weakly to GnT I than UDP-GlcNAc, is then free
to diffuse out of the binding site, to be replaced by a fresh molecule of UDP-GlcNAc. By destroying the
acceptor/product binding pocket, these kinetics also ensure that the enzyme is not strongly inhibited by the
oligosaccharide product.
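
For orientation, the initial-velocity behaviour of an ordered sequential Bi Bi mechanism can be sketched with the standard steady-state rate equation for the product-free case, v = Vmax[A][B] / (Kia·Kb + Kb[A] + Ka[B] + [A][B]), where A is the first substrate to bind (here the Mn²⁺/UDP-GlcNAc donor) and B the second (the acceptor). The function name and all parameter values below are hypothetical; no kinetic constants are given in this document.

```python
def ordered_bi_bi_rate(a, b, vmax, ka, kb, kia):
    """Initial velocity of an ordered Bi Bi reaction with no products present.

    a, b : concentrations of the first- and second-binding substrates
    vmax : maximal velocity
    ka, kb : Michaelis constants for A and B
    kia : dissociation constant of the enzyme-A complex
    All constants are assumed to be positive.
    """
    if a < 0 or b < 0:
        raise ValueError("concentrations must be non-negative")
    denominator = kia * kb + kb * a + ka * b + a * b
    return vmax * a * b / denominator

# Hypothetical constants in arbitrary units, chosen only to exercise the formula.
example_rate = ordered_bi_bi_rate(a=1.0, b=1.0, vmax=1.0, ka=1.0, kb=1.0, kia=1.0)
print(example_rate)
```

The Kia·Kb term is what distinguishes the sequential equation from a ping-pong mechanism, in which that term is absent.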

The structure also shows that GnT I does not itself have a Mn²⁺ ion binding site: there is only a
single direct protein-metal interaction. The Mn²⁺ ion is clearly more fully coordinated by UDP-GlcNAc, and
positioned on the surface of the protein by virtue of its interactions with the nucleotide sugar. This mode of
binding may also be an important determinant of how the enzyme releases its products. In the absence of an
independent metal binding site, the UDP-Mn²⁺ complex would be free to dissociate from the enzyme surface
once catalysis has occurred.

The suggestion that bound UDP cannot support loop structuring stems from an analysis of the loop's
interactions with UDP-GlcNAc in the complex. As discussed earlier, two residues (V321 and S322) at the tip
of the loop form hydrogen bonds with oxygen atoms from the two phosphates. The loop's interactions are not
otherwise very extensive, altogether burying only 50 Å² of the bound nucleotide sugar. Once the bond
between the GlcNAc C1 and the β-phosphate oxygen is broken, the terminal phosphate acquires an additional
negative charge and presumably greater mobility (the latter enhanced by the lack of an independent Mn²⁺ ion
binding site). Together, these effects would be expected to disrupt the ability of the phosphates to structure the
loop. As such, it would seem that the structured loop can be thought of as a sensor for the integrity of the
GlcNAc-phosphate linkage, thereby regulating formation and destruction of the acceptor/product binding site.

The SGC Domain

Analysis shows that the structure of domain 1 of GnT I is very similar to that of the B. subtilis
glycosyltransferase spsA (residues 2-217) (Charnock and Davies, 1999). It possesses an identical topology,
and all of the major secondary structural elements characterizing the domain are found in both structures
(Figure 29). The domain is also found, with some modification in secondary structure (the topology remains
the same), in β4Gal-T1 (residues 180-346) (Gastinel et al, 1999), and GlmU (residues 4-227, Figure 30)
(Brown et al, 1999). Structural alignment using the program DALI yields Z-scores of 15.7, 10.6 and 9.8 with
spsA, β4Gal-T1 and GlmU, respectively. The very strong structural similarity between GnT I domain 1 and
spsA suggests the existence of a canonical core domain, the SGC domain (spsA GnT I core domain),
represented in these four structures.

Despite the structural similarity shown by these enzymes, they do not show significant sequence
similarity. Even with a knowledge of the structural alignment, GnT I shows only 10%, 12%, and 7% sequence
identity with spsA, β4Gal-T1, and GlmU, respectively. These levels of identity make it difficult, if not
impossible, to establish whether or not these enzymes stem from a common ancestor. Analysis of residues
critical for function may, however, shed light on this question. The position of the UDP moiety in the GnT I
complex is virtually identical to that found in the spsA complex (Figure 29) and is also very similar to that seen
in the β4Gal-T1 and GlmU complexes. Moreover, the DxD motif is present in all four of these proteins and
forms a perfectly superimposable type I β-turn in each case. Finally, at the position of D291, the proposed
catalytic base in GnT I, both glycosyltransferases, spsA (D191) and β4Gal-T1 (D318), also possess Asp
residues. Not only are these key residues and functional features identical in these structures, they are found
at the same position on the structural/topological framework. The low sequence identity, common fold, and
related functional features define the SGC superfamily, whose members are therefore likely to share a common
evolutionary origin (Murzin et al, 1995).
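
Percent identity figures such as those quoted above depend on how the aligned positions are counted. The sketch below shows one common convention, assuming that identity is computed over positions where both aligned sequences have a residue (gap positions are skipped). The document does not state which denominator was used, and the aligned strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences contribute a residue.
    pairs = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != gap and y != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

# Hypothetical structure-based alignment fragment, not real GnT I or spsA sequence.
identity = percent_identity("ACD-EF", "ACE-EG")
print(f"{identity:.1f}% identity")
```

Other conventions (for example, dividing by the length of the shorter sequence) would give different percentages for the same alignment.
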
The SGC Superfamily 

The lack of sequence identity between glycosyltransferases with different specificities has led to a
classification that now includes 44 glycosyltransferase families. GnT I, for example, is in a family of its own,
and a Position-Specific Iterated BLAST (PSI-BLAST) search, using the GnT I sequence, identifies no other
related glycosyltransferases. Based on the knowledge that the GnT I SGC domain is structurally similar to
spsA, an attempt was made at finding sequence similarity between these and other glycosyltransferases,
thereby extending the SGC superfamily. The spsA sequence, coming from a much larger glycosyltransferase
family containing many divergent sequences, provides a more robust profile, and it was used to seed a
PSI-BLAST search (Altschul et al, 1997). The search was able to identify similarity between spsA (family 2) and
rabbit GnT I (family 13). It also showed similarity between spsA and the β-1,4-GalNAc transferases (family
12), the ceramide glucosyltransferases (family 21) and the polypeptide GalNAc transferases (family 27);
neither β4Gal-T1 (family 7) nor GlmU appeared in the searches.

To further explore possible relationships among the glycosyltransferase families, protein threading
was used to determine the compatibility of a number of glycosyltransferase sequences with the SGC domain.
Using the program THREADER 2, a single arbitrarily selected sequence from each of the 27
glycosyltransferase families described by Campbell et al. (Campbell et al, 1998; Campbell et al, 1997) was
run against a database of 1900 structures, which included the SGC domain of GnT I, spsA, β4Gal-T1 and
GlmU. In both the normal and randomized test scores, the selected sequences from family 2, family 7 and
family 13 ranked first or second against the SGC domain of spsA, β4Gal-T1 and GnT I, respectively, as would
be expected. The sequences from family 3, family 6, family 16 and family 26 also ranked first or second in the
two tests; sequences from several other families also received high scores. These results, and those based on
PSI-BLAST searching, suggest that the SGC domain is widely represented among different families and
includes both inverting and retaining glycosyltransferases.

Table 8 shows protein threading results. Proteins from different families were threaded against a
THREADER 2 database containing 1900 protein folds, including GnT I, spsA, GlmU, and β4Gal-T1. The
folds were sorted on the basis of their filtered combined energy Z-scores. When a GTCD-1-containing fold was
one of the top thirty hits, out of 1900, the top thirty hits were rerun with a randomization test of fifty
shuffled-sequence threadings for each fold, to give a combined energy shuffled Z-score. A correct prediction
should score well in both tests. Note that not only are inverting families represented, but so are retaining
glycosyltransferases.
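
The idea behind a shuffled-sequence Z-score can be sketched as follows: the score of the native sequence is compared with the distribution of scores obtained after randomly shuffling that sequence, and the difference is expressed in standard deviations. This is a generic illustration only; it does not reproduce THREADER 2's energy function or its exact Z-score definition, and `toy_energy` is a hypothetical stand-in for a real threading energy (lower is taken to be better).

```python
import random
import statistics

def shuffled_z_score(score_fn, sequence, n_shuffles=50, seed=0):
    """Z-score of the native score against scores of shuffled sequences."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    native = score_fn(sequence)
    shuffled_scores = []
    for _ in range(n_shuffles):
        residues = list(sequence)
        rng.shuffle(residues)  # same composition, randomized order
        shuffled_scores.append(score_fn("".join(residues)))
    mean = statistics.mean(shuffled_scores)
    spread = statistics.stdev(shuffled_scores)
    if spread == 0:
        raise ValueError("shuffled scores have zero variance")
    return (native - mean) / spread

def toy_energy(sequence):
    """Hypothetical energy: each pair of identical neighbours lowers it by 1."""
    return -sum(1 for x, y in zip(sequence, sequence[1:]) if x == y)

z = shuffled_z_score(toy_energy, "AAAAAKKKKKLLLLL")
print(f"shuffled Z-score: {z:.2f}")
```

With an energy-like score, a strongly negative Z-score indicates that the native sequence fits better than its shuffled versions, which is the sense in which a correct prediction "should score well" in the randomization test.
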
Conclusion 

The structure of the catalytic domain of GnT I has provided the basis for its Mn²⁺/UDP-GlcNAc
binding properties, as well as insight into both its catalytic and kinetic mechanisms. The structure of the DxD
motif shows that the first conserved residue plays a role in binding the donor sugar, while the second
coordinates the essential Mn²⁺ ion. These roles are likely to be conserved in other DxD-containing
glycosyltransferases, regardless of donor specificity. In addition, structural analysis has defined the SGC
domain, seen in GnT I, spsA, β4Gal-T1 and GlmU. Sequence analysis and protein threading show that the
SGC domain is contained in enzymes from several of the existing inverting and retaining glycosyltransferase
families. Among these are enzymes involved in mammalian N- and O-linked oligosaccharide biosynthesis,
bacterial cell wall production, and the synthesis of glycogen, chitin and cellulose. Together, they constitute the
SGC superfamily.

Having illustrated and described the principles of the invention in a preferred embodiment, it should
be appreciated by those skilled in the art that the invention can be modified in arrangement and detail without
departure from such principles. All modifications coming within the scope of the following claims are claimed.

All publications, patents and patent applications referred to herein are incorporated by reference in
their entirety to the same extent as if each individual publication, patent or patent application was specifically
and individually indicated to be incorporated by reference in its entirety. In particular, U.S. provisional patent
applications Serial Nos. 60/139,949, filed June 18, 1999, 60/161,809, filed October 27, 1999, 60/178,401,
filed January 27, 2000, and 60/202,509, filed May 5, 2000, are incorporated herein by reference.
Table 1

REMARK GnT-1 native structure, "gntlg"
REMARK Ulug Unligil, 1999 06 14
REMARK coordinates from restrained individual B-factor refinement
REMARK refinement resolution: 500.0 - 1.5 A
REMARK starting r= .2186 free_r= .2322
REMARK final r= .1991 free_r= .2154
REMARK B rmsd for bonded mainchain atoms= .796 target= 1.5
REMARK B rmsd for bonded sidechain atoms= 1.517 target= 2.0
REMARK B rmsd for angle mainchain atoms= 1.237 target= 2.0
REMARK B rmsd for angle sidechain atoms= 2.317 target= 2.5
REMARK wa= .685709
REMARK rweight= .167519
REMARK target= mlf steps= 60
REMARK sg= P2(1)2(1)2(1) a= 40.478 b= 82.423 c= 102.480 alpha= 90 beta= 90 gamma= 90
REMARK parameter file 1 : CNS_TOPPAR:protein_rep.param
REMARK parameter file 2 : CNS_TOPPAR:water_rep.param
REMARK molecular structure file: generate_easy.mtf
REMARK input coordinates: bgroup.ann.pdb
REMARK reflection file= ../data/gntlg_start.cv
REMARK ncs= none
REMARK B-correction resolution: 6.0 - 1.5
REMARK initial B-factor correction applied to f_w3 :
REMARK B11= -.092 B22= 1.661 B33= -1.569
REMARK B12= .000 B13= .000 B23= .000
REMARK B-factor correction applied to coordinate array B:
REMARK bulk solvent: density level= .380844 e/A^3, B-factor
REMARK reflections with |Fobs|/sigma_F < 0.0 rejected
REMARK reflections with |Fobs| > 10000 * rms(Fobs) rejected
REMARK anomalous diffraction data was input
REMARK theoretical total number of refl. in resol. range: ( 100.0 % )
REMARK number of unobserved reflections (no entry or |F|=0): 6093 ( 5.7 % )
REMARK number of reflections rejected: 0 ( .0 % )
REMARK total number of reflections used: 99934 ( 94.3 % )
REMARK number of reflections in working set: 95035 ( 89.6 % )
REMARK number of reflections in test set: 4899 ( 4.6 % )
REMARK FILENAME="bindividual.ann.pdb"
REMARK DATE: 14-Jun-99 15:30:36 created by user: ulu
Table 2

REMARK GnT-1 structure with MeHg bound, "gntlf"
REMARK Ulug Unligil, 1999 06 11
REMARK coordinates from restrained individual B-factor refinement
REMARK refinement resolution: 500.0 - 1.5 A
REMARK starting r= .2545 free_r= .2672
REMARK final r= .2369 free_r= .2501
REMARK B rmsd for bonded mainchain atoms= .840 target= 1.5
REMARK B rmsd for bonded sidechain atoms= 1.595 target= 2.0
REMARK B rmsd for angle mainchain atoms= ,299 target= 2.0
REMARK B rmsd for angle sidechain atoms= ,451 target= 2.5
REMARK wa= .901697
REMARK rweight= .157734
REMARK target= mlf steps= 30
REMARK sg= P2(1)2(1)2(1) a= 40.382 b= 82.378 c= 102.487 alpha= 90 beta= 90 gamma= 90
REMARK parameter file 1 : CNS_TOPPAR:protein_rep.param
REMARK parameter file 2 : ../data/mmc.param
REMARK parameter file 3 : CNS_TOPPAR:water_rep.param
REMARK molecular structure file: generate_easy.mtf
REMARK input coordinates: bgroup.ann.pdb
REMARK reflection file= ../data/gntl_start.cv
REMARK ncs= none
REMARK B-correction resolution: 6.0 - 1.5
REMARK initial B-factor correction applied to f_w1 :
REMARK B11= -.069 B22= 1.877 B33= -1.809
REMARK B12= .000 B13= .000 B23= .000
REMARK B-factor correction applied to coordinate array B: -.760
REMARK bulk solvent: density level= .377577 e/A^3, B-factor= 29.956 A^2
REMARK reflections with |Fobs|/sigma_F < 0.0 rejected
REMARK reflections with |Fobs| > 10000 * rms(Fobs) rejected
REMARK anomalous diffraction data was input
REMARK theoretical total number of refl. in resol. range: ( 100.0 % )
REMARK number of unobserved reflections (no entry or |F|=0): ( 20.9 % )
REMARK number of reflections rejected: ( .0 % )
REMARK total number of reflections used: ( 79.1 % )
REMARK number of reflections in working set: ( 75.3 % )
REMARK number of reflections in test set: ( 3.9 % )
REMARK FILENAME="bindividual.ann.pdb"
-6.955 36.149 1.00 11.67 S
-11.017 21.146 1.00 17.57 S
25.054 25.975 1.00 9.70 S
.870 4.261 1.00 8.84 S
-4.248 36.027 1.00 13.00 S
13.645 2.488 1.00 13.47 S
Table 3

REMARK GnT I "be" Structure of rabbit GnT I bound to UDP-GlcNAc and a
REMARK Manganese 2+ ion. Ulug Unligil & Dr. James Rini, Oct 25, 1999
REMARK coordinates from minimization refinement
REMARK refinement resolution: 50.0 - 1.8 A
REMARK starting r= 0.2006 free_r= 0.2388
REMARK final r= 0.1987 free_r= 0.2388
REMARK rmsd bonds= 0.006698 rmsd angles= 1.36297
REMARK wa= 1.4
REMARK target= mlf cycles= 1 steps= 200
REMARK sg= P2(1)2(1)2(1) a= 40.541 b= 82.190 c= 101.956 alpha= 90 beta= 90 gamma= 90
REMARK parameter file 1 : CNS_TOPPAR:protein_rep.param
REMARK parameter file 2 : CNS_TOPPAR:ion.param
REMARK parameter file 3 : ../../data/udpglcnac.param
REMARK parameter file 4 : ../../data/glycerol.param
REMARK parameter file 5 : CNS_TOPPAR:water_rep.param
REMARK molecular structure file: ../alternate.mtf
REMARK input coordinates: bindividual.bi4.lO.pdb
REMARK reflection file= ../../data/gntlbe.cv
REMARK ncs= none
REMARK B-correction resolution: 6.0 - 1.8
REMARK initial B-factor correction applied to fobs :
REMARK B11= 4.245 B22= 1.052 B33= -5.296
REMARK B12= 0.000 B13= 0.000 B23= 0.000
REMARK B-factor correction applied to coordinate array B: -1.075
REMARK bulk solvent: density level= 0.415966 e/A^3, B-factor= 55.91 A^2
REMARK reflections with |Fobs|/sigma_F < 0.0 rejected
REMARK reflections with |Fobs| > 10000 * rms(Fobs) rejected
REMARK anomalous diffraction data was input
REMARK theoretical total number of refl. in resol. range: 61022 ( 100.0 % )
REMARK number of unobserved reflections (no entry or |F|=0): 18103 ( 29.7 % )
REMARK number of reflections rejected: 0 ( 0.0 % )
ATOM 3109 CD1 ILE 113 -6.182 14.283 8.794 0.50 2.46 AC2
ATOM 3110 C ILE 113 -2.460 13.442 6.524 0.50 8.18 AC2
ATOM 3111 O ILE 113 -1.352 13.144 6.976 0.50 7.65 AC2
Table 4

REMARK Model of GnT I with Acceptor. GnT I "be" with experimental UDP-
REMARK GlcNAc and Manganese 2+ ion, with Man5GlcNAc2 acceptor modeled into
REMARK the active site. Ulug Unligil & Dr. James Rini, October 25, 1999.
REMARK coordinates from minimization refinement
REMARK refinement resolution: 50.0 - 1.8 A
REMARK starting r= 0.2113 free_r= 0.2440
REMARK final r= 0.2103 free_r= 0.2424
REMARK rmsd bonds= 0.005928 rmsd angles= 1.31456
REMARK wa= 1.03895
REMARK target= mlf cycles= 1 steps= 200
REMARK sg= P2(1)2(1)2(1) a= 40.541 b= 82.190 c= 101.956 alpha= 90 beta= 90 gamma= 90
REMARK parameter file 1 : CNS_TOPPAR:protein_rep.param
REMARK parameter file 2 : CNS_TOPPAR:ion.param
REMARK parameter file 3 : ../../../data/udpglcnac.param
REMARK parameter file 4 : CNS_TOPPAR:water_rep.param
REMARK parameter file 5 : CNS_TOPPAR:carbohydrate.param
REMARK molecular structure file: alternate.mtf
REMARK input coordinates: alternate.pdb
REMARK reflection file= ../../../data/gntlbe.cv
REMARK ncs= none
REMARK B-correction resolution: 6.0 - 1.8
REMARK initial B-factor correction applied to fobs :
REMARK B11= 4.242 B22= 1.045 B33= -5.287
REMARK B12= 0.000 B13= 0.000 B23= 0.000
REMARK B-factor correction applied to coordinate array B: -0.095
REMARK bulk solvent: density level= 0.423009 e/A^3, B-factor= 57.5717 A^2
REMARK reflections with |Fobs|/sigma_F < 0.0 rejected
REMARK reflections with |Fobs| > 10000 * rms(Fobs) rejected
REMARK anomalous diffraction data was input
CRYST1 40.541 82.190 101.956 90.00 90.00 90.00 P 21 21 21
REMARK FILENAME="minimize.200.pdb"
REMARK DATE:24-Oct-1999 23:28:47 created by user: ulu
REMARK VERSION:0.9a
22.873 25.644 1.00 30.27 S

9.651 14.274 1.00 18.18 S 

-2.770 39.309 1.00 16.58 S

2.426 50.167 1.00 33.80 S 

16.924 23.159 1.00 32.68 S 

2.300 -7.962 1.00 25.82 S

16.379 45.492 1.00 36.75 S 

2.145 46.020 1.00 22.60 S

6.194 39.240 1.00 18.76 S 

-10.168 29.047 1.00 21.32 S 

4.284 31.088 1.00 26.65 S 

27.389 24.081 1.00 30.69 S 

21.757 -4.228 1.00 21.82 S 

10.479 -2.886 1.00 34.62 S 

12.971 27.484 1.00 24.57 S 

1.476 13.722 1.00 27.29 S 

6.943 10.471 1.00 33.74 S 

-2.258 23.786 1.00 42.12 S 

-9.147 18.487 1.00 27.37 S 

0.925 -0.905 1.00 34.57 S 

14.419 22.729 1.00 24.07 S 

3.605 34.093 1.00 35.10 S 

25.898 37.100 1.00 33.19 S 

12.845 -9.138 1.00 34.17 S 

32.975 16.990 1.00 26.97 S 

-7.564 6.737 1.00 35.93 S 

17.049 44.135 1.00 28.48 S 

5.202 6.054 1.00 28.41 S 

27.279 -3.127 1.00 39.41 S 

6.732 -9.043 1.00 24.53 S

5.364 20.588 1.00 29.42 S

-1.757 -2.933 1.00 27.98 S 

15.672 -3.972 1.00 23.60 S 

1.253 45.854 1.00 25.81 S 

21.577 42.993 1.00 25.22 S 

-4.405 29.925 1.00 29.96 S 

14.056 44.322 1.00 26.90 S 

19.594 3.031 1.00 27.12 S 

6.576 24.151 1.00 32.90 S 

-10.666 35.339 1.00 22.89 S 

17.888 6.460 1.00 30.59 S 

-1.501 -6.675 1.00 43.27 S 

-9.371 24.429 1.00 21.84 S 

24.812 34.841 1.00 31.56 S 

8.066 5.149 1.00 30.92 S 

16.773 13.800 1.00 31.52 S

31.822 23.573 1.00 32.45 S 

7.069 46.879 1.00 32.19 S 

1.539 7.902 1.00 36.38 S 

22.411 42.677 1.00 20.68 S 

33.642 4.827 1.00 26.06 S 

15.966 16.129 1.00 28.25 S 

21.703 46.935 1.00 30.57 S 

-0.156 27.372 1.00 23.51 S

-1.432 5.324 1.00 39.15 S 

-7.471 31.762 1.00 23.96 S 

8.523 -10.613 1.00 29.15 S 

13.681 52.922 1.00 36.22 S 

-0.211 39.413 1.00 27.76 S 

-1.194 7.321 1.00 26.89 S 

18.990 -1.378 1.00 24.02 S 

15.046 24.611 1.00 26.61 S 
253 -5.486 -4.379 2.498 1.00 25.72 S
254 -5.892 4.867 -3.600 1.00 30.28 S
259 7.639 15.112 -4.183 1.00 19.48 S
260 17.966 8.018 16.377 1.00 50.14 S
262 6.927 6.927 51.120 1.00 36.10 S
263 12.270 12.800 27.411 1.00 38.90 S
264 -9.913 4.116 -3.359 1.00 35.97 S
266 -19.276 18.340 15.646 1.00 23.56 S
267 -16.663 12.896 25.800 1.00 20.01 S
270 -2.3X2 29.497 25.547 1.00 13.85 S
271 18.495 14.333 23.517 1.00 24.52 S
273 19.487 2.148 39.357 1.00 21.73 S
275 5.526 -11.397 20.474 1.00 40.91 S
276 7.711 29.938 34.474 1.00 22.25 S
278 -10.024 29.043 4.853 1.00 20.31 S
279 -9.570 35.739 3.011 1.00 24.60 S
280 -8.330 -5.504 27.111 1.00 27.85 S
281 -12.008 -2.272 29.037 1.00 28.01 S
282 -18.223 17.511 0.409 1.00 44.50 S
283 9.967 32.965 36.408 1.00 32.45 S
284 -20.025 20.841 25.208 1.00 32.55 S
285 -3.135 -3.705 1.137 1.00 29.29 S
286 -15.605 -0.858 36.573 1.00 31.30 S
287 22.736 8.293 50.227 1.00 42.33 S
288 17.524 0.865 40.770 1.00 28.19 S
289 7.381 0.344 -0.365 1.00 38.81 S
290 -9.434 31.287 3.731 1.00 29.14 S
292 14.395 0.506 26.802 1.00 31.68 S
293 -18.885 7.379 26.307 1.00 34.11 S
294 1.506 21.499 44.520 1.00 27.32 S
295 -15.165 28.115 15.669 1.00 24.48 S
296 -3.176 23.510 44.563 1.00 34.08 S
297 28.692 6.135 44.148 1.00 34.24 S
298 -21.550 6.040 14.130 1.00 40.59 S
299 -16.657 27.596 24.184 1.00 40.98 S
300 2.591 21.757 42.148 1.00 27.88 S
301 24.297 5.640 49.289 1.00 50.83 S
302 17.500 18.253 10.340 1.00 35.00 S
303 21.687 0.862 37.246 1.00 32.98 S
304 -17.710 13.870 29.687 1.00 33.06 S
305 -9.948 28.208 7.254 1.00 39.84 S
306 14.234 -1.991 39.400 1.00 39.52 S
307 -6.703 29.552 17.319 1.00 26.55 S
308 3.310 6.855 -7.683 1.00 29.35 S
309 13.321 25.980 44.243 1.00 31.85 S
501 3.440 22.064 17.321 1.00 50.88 X
501 2.691 21.140 18.265 1.00 48.50 X
501 2.239 20.000 17.561 1.00 50.47 X
501 3.626 20.714 19.388 1.00 43.43 X
501 2.960 19.787 20.229 1.00 41.15 X
501 4.907 20.079 18.827 1.00 41.19 X
501 5.832 19.915 19.888 1.00 43.86 X
501 5.551 20.961 17.744 1.00 42.11 X
501 4.564 21.375 16.771 1.00 47.10 X
501 6.647 20.230 16.986 1.00 40.18 X
501 7.161 21.023 15.924 1.00 38.11 X
502 -0.753 27.426 26.438 1.00 71.88 V
502 -0.025 27.235 27.787 1.00 65.50 V
502 1.353 27.536 27.640 1.00 64.12 V
502 -0.628 28.096 28.909 1.00 61.97 V
502 0.329 29.029 29.391 1.00 56.59 V
502 -1.875 28.833 28.423 1.00 63.11 V
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UDl 




449 


-0.978 


18.658 


13. 


783 


1.00 


22.83 




ATOM 


3139 


05* 


UDl 




449 


1.940 


20.677 


14. 


818 


1.00 


30.11 






3140 


06* 


DDI 




449 


-0,163 


22.698 


14. 


267 


1.00 


22.21 




ATOM 


3141 


07' 


ODl 




449 


3.670 


17.236 


17. 


507 


1.00 


34.89 




ATOM 


3142 


Nl 


UDl 




449 


-0.675 


19.435 


6. 


886 


1.00 


13.55 




ATOM 


3143 


C2 


UDl 




449 


-1.869 


19.855 


6. 


304 


1.00 


13.16 




ATOM 


3144 


N3 


UDl 




449 


-1.841 


20.980 


5. 


574 


1.00 


12.14 




ATOM 


3145 


C4 


UDl 




449 


-0.776 


21.768 


5. 


366 


1.00 


13.78 




ATOM 


3146 


C5 


UDl 




449 


0.523 


21.374 


5. 


953 


1.00 


15.04 




ATOM 


3147 


C6 


UDl 




449 


0.574 


20.242 


6. 


697 


1.00 


13. 86 




ATOM 


3148 


02 


UDl 




449 


-2.955 


19.293 


6. 


419 


1,00 


14.23 




ATOM 


3149 


04 


UDl 




449 


-0.859 


22.730 


4, 


614 


1.00 


12. 86 




ATOM 


3150 


CI* 


UDl 




449 


-0.651 


18. 192 


7. 


687 


1.00 


14-93 




ATOM 


3151 


C2* 


UDl 




449 


0. 496 


17.207 


7.363  1.00  13.22
ATOM   3152  O2*   UD1  449     0.143  16.450   6.194  1.00  14.37
ATOM   3153  C3*   UD1  449     0.615  16.401   8.681  1.00  14.29
ATOM   3154  C4*   UD1  449     0.139  17.427   9.745  1.00  14.97
ATOM   3155  O4*   UD1  449    -0.534  18.483   9.060  1.00  13.43
ATOM   3156  O3*   UD1  449    -0.330  15.327   8.642  1.00  16.70
ATOM   3157  C5*   UD1  449     1.320  18.119  10.503  1.00  15.80
ATOM   3158  O5*   UD1  449     2.300  18.744   9.647  1.00  15.45
ATOM   3159  PA    UD1  449     3.840  18.566   9.826  1.00  18.59
ATOM   3160  O1A   UD1  449     4.414  18.996   8.518  1.00  14.88
ATOM   3161  O2A   UD1  449     4.146  17.168  10.092  1.00  16.57
ATOM   3162  O3A   UD1  449     4.257  19.452  10.954  1.00  19.35
ATOM   3163  PB    UD1  449     4.449  19.218  12.489  1.00  25.38
ATOM   3164  O1B   UD1  449     5.459  20.138  13.059  1.00  28.03
ATOM   3165  O2B   UD1  449     4.753  17.787  12.765  1.00  21.55
ATOM   3166  MN+2  MN2  448     5.258  16.175  11.593  1.00  13.36
ATOM   3167  N     ILE  113    -3.786  11.902   7.815  0.50   7.78  AC2
ATOM   3168  CA    ILE  113    -3.713  13.286   7.360  0.50   7.25  AC2
ATOM   3169  CB    ILE  113    -3.651  14.267   8.555  0.50   6.73  AC2
ATOM   3170  CG2   ILE  113    -3.633  15.708   8.051  0.50   6.66  AC2
ATOM   3171  CG1   ILE  113    -4.858  14.050   9.473  0.50   4.62  AC2
ATOM   3172  CD1   ILE  113    -6.196  14.256   8.790  0.50   2.31  AC2
ATOM   3173  C     ILE  113    -2.458  13.445   6.508  0.50   8.03  AC2
ATOM   3174  O     ILE  113    -1.350  13.164   6.965  0.50   7.50  AC2
END
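
The coordinate records above are whitespace-separated rows: record type, serial number, atom name, residue name, residue number, x, y, z, occupancy, B-factor, and an optional segment label. The following is a minimal Python sketch for reading rows in that layout. It assumes this column order (it is not a parser for the fixed-column PDB file format), and the names `Atom` and `parse_atom_line` are illustrative and do not come from the application.

```python
from typing import NamedTuple, Optional


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    res_num: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float
    segment: Optional[str]


def parse_atom_line(line: str) -> Optional[Atom]:
    """Parse one whitespace-separated ATOM row; return None for other rows."""
    fields = line.split()
    if len(fields) < 10 or fields[0] != "ATOM":
        return None
    # The segment label (e.g. AC2) is present only on some rows.
    segment = fields[10] if len(fields) > 10 else None
    return Atom(
        int(fields[1]), fields[2], fields[3], int(fields[4]),
        float(fields[5]), float(fields[6]), float(fields[7]),
        float(fields[8]), float(fields[9]), segment,
    )


# Two rows taken from the listing, plus the terminating END record.
rows = [
    "ATOM 3166 MN+2 MN2 448 5.258 16.175 11.593 1.00 13.36",
    "ATOM 3167 N ILE 113 -3.786 11.902 7.815 0.50 7.78 AC2",
    "END",
]
atoms = [a for a in map(parse_atom_line, rows) if a is not None]
```

Rows that are not ATOM records, such as the closing END line, are skipped rather than treated as errors.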
Table 5. Intermolecular contacts of GnT I-UDP-GlcNAc Complex and GnT I-Man5GlcNAc2 Complex

No. of Atomic Interaction | Nucleotide Sugar Donor or Acceptor Atomic Contact | Enzyme Atomic Contact | Distance Between Atomic Contacts | Atomic Interaction Property
1 | Uracil O2 | His 190 ND1 | 2.7 | HB
2 | Uracil N3 | Asp 144 | 2.8 | HB
3, 4 | Uracil Ring | Cys 115-Cys 145; Ile 187 | 3.7; [illegible] | VW; VW
5 | Uracil C5 | Val 321 | 3.6 | VW
6, 7 | Ribose O3'(H); O2'(H) | Asp 212; Asp 212 | 2.9; 3.2 direct, and via water: 2.9 to water, and 3.0 to Asp 212 | HB; HB, HB, HB
8, 9, 10 | α-Phosphate; β-phosphate | Arg 117 NH; Val 321; [illegible] | 2.8; 2.7 | SB; HB
11, 12 | Loop Structure; α-phosphate | Val 191; [illegible]; Asp 116 | [illegible]; via 2.8 water, 2.8 to second water, 2.7 to Asp | HB; HB, HB; HB
13 | β-phosphate | Ser 322 | 2.5 | HB
14 | GlcNAc O3 | Glu 211 | 2.7 | HB
15, 16, 17 | O6 | Phe 289; Trp 290; Tyr 184 | via 2.7 to water, 2.8; 3.2; 2.9 | HB, HB; HB; HB
18, 19 | O4 | Glu 211; Trp 290 | 2.6; 2.8 | HB; HB
20, 21 | CH3 | Leu 269; Leu 331 | 3.4; 3.3 | VW; VW
22 | α-1,3-mannose O2 | Asp 291 OD1 | 2.4 | HB
23, 24 | O3 | Asp 291; Arg 295 | 3.1; 2.9 | HB; HB
25, 26, 27 | O4; O6; C6 | Arg 415; Ser 322; Phe 326 | via 2.6 water, 2.5 to Arg; 2.6; 3.6 | HB; HB; VW

HB: hydrogen bond interaction
VW: van der Waals
SB: salt bridge
Table 6. Crystallographic data and refinement statistics.

Data sets: Derivative (MeHgCl): Edge, Peak; Native Complex with UDP-GlcNAc and Mn2+

Crystal parameters:
Space group | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2₁
a (Å) | 40.4 | 40.5 | 40.3
b (Å) | 82.4 | 82.4 | 82.2
c (Å) | 102.5 | 102.5 | [illegible]

Diffraction statistics:
Wavelength (Å) | 1.0093 | 1.0075 | 0.9914 | 1.0713
Resolution range (Å) | 31.72-1.4 | 31.72-1.4 | 38.24-1.5 | 34.25-1.8
Measured reflections (n) | 348028 | 401605 | 64537
Unique reflections (n) | 102627 | 102213 | 99934 | 42919
Completeness (%) | 78.7 | 78.4 | 94.2 | 70.3
Rsym* | 0.047 | 0.065 | 0.092
Sites (n) | 1 | 1
Phasing power†:
Dispersive | 1.64
Anomalous | 2.26
Figure of merit, before solvent flattening | 0.581

Refinement statistics:
R factor | 0.167 | 0.166 | 0.185
[illegible] | 0.189 | 0.194 | 0.229
Total atoms (n) | 3204 | 3167 | 3138
Protein | 2710 | 2710 | 2811
Substrate | 0 | 0 | 40
Water | 492 | 457 | 275
Rmsd‡ bond length (Å) | 0.011 | 0.013 | 0.010
Rmsd bond angle (°) | 1.5 | 1.6 | 1.5
Mean B value (Å²) | 14.2 | 14.4 | 16.2
Protein | 12.3 | 12.3 | 16.0
Domain 1 (106-317) | 11.5 | 11.3 | 14.2
Loop (318-330) | 28.3
Linker (331-353) | 12.1 | 12.1 | 15.4
Domain 2 (354-447) | 14.1 | 14.6 | 18.7
Substrates | 23.0
Water | 26.6 | 27.9 | 25.9

* Rsym = Σ|I − <I>| / ΣI, where I is the observed intensity and <I> is the average intensity obtained from multiple
observations of symmetry-related reflections. † Phasing power, root mean square (rms) FH / rms ε, where ε is lack of
closure and FH is the calculated heavy atom structure factor. ‡ Rmsd, root mean squared deviation.
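
As an illustration of the Rsym footnote, the sketch below computes the statistic in its standard form, the sum of |I − <I>| over all observations divided by the sum of I, with <I> taken per group of symmetry-related reflections. The function name `r_sym` and the intensity values are hypothetical and are not data from the application.

```python
def r_sym(groups):
    """Rsym over groups of symmetry-related intensity observations.

    Each element of `groups` is a list of observed intensities I for one
    unique reflection; <I> is the mean of that list.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in groups:
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Hypothetical intensities for two unique reflections (illustration only).
example_groups = [[100.0, 110.0], [50.0, 50.0, 56.0]]
example_r_sym = r_sym(example_groups)
```

Lower values indicate better agreement between repeated measurements of equivalent reflections.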



Table 7

The UDP-GlcNAc binding site. Listed here are the distances between the UDP-GlcNAc, the Mn2+, bound
waters, and the protein atoms involved in their binding.

Interacting Atoms | | Distance (Å) | Interacting Atoms | | Distance (Å)
Uracil N3 | D144 OD2 | 2.8 | GlcNAc O6 | H2O 4 | 2.7
Uracil O2 | H190 ND1 | 2.7 | Mn2+ | D213 OD2 | [illegible]
Ribose O2' | D212 OD1 | 3.2 | Mn2+ | H2O 38 | 2.4
Ribose O2' | H2O 40 | 2.9 | Mn2+ | H2O 87 | 2.4
Ribose O3' | D212 OD1 | 2.9 | Mn2+ | H2O 116 | 2.1
α-Phosphate O1A | V321 N | 2.7 | H2O 4 | Y184 O | 2.9
α-Phosphate O1A | H2O 72 | 2.8 | H2O 4 | F289 N | 2.8
α-Phosphate O2A | R117 NH2 | 2.8 | H2O 4 | W290 N | 3.2
α-Phosphate O2A | Mn2+ | 2.1 | H2O 27 | L269 N | 3.0
β-Phosphate O1B | S322 OG | 2.5 | H2O 38 | E211 OE1 | 2.4
β-Phosphate O2B | Mn2+ | 2.1 | H2O 38 | D213 OD1 | 2.8
GlcNAc O7 | H2O 263 | 2.8 | H2O 40 | D212 OD2 | 3.0
GlcNAc O3 | E211 OE1 | 2.7 | H2O 87 | T315 OG1 | 3.0
GlcNAc O3 | H2O 27 | 2.6 | H2O 116 | G317 O | 2.6
GlcNAc O4 | E211 OE2 | 2.6 | H2O 263 | D291 OD1 | 2.9
GlcNAc O4 | W290 NE1 | 2.8 | H2O 263 | R295 NH2 | 3.0
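
Distances of this kind can be checked against the atomic coordinates. The Python sketch below takes the coordinates of atoms 3166 (Mn2+), 3161 (α-phosphate O2A), and 3165 (β-phosphate O2B) from the coordinate listing that precedes Table 5 and computes the two Mn2+ to phosphate-oxygen distances. The variable and function names are illustrative only.

```python
import math

# Coordinates (x, y, z) in Å, copied from the ATOM records 3166, 3161, 3165.
MN = (5.258, 16.175, 11.593)
O2A = (4.146, 17.168, 10.092)
O2B = (4.753, 17.787, 12.765)


def distance(p, q):
    """Euclidean distance between two 3-D points."""
    return math.dist(p, q)


mn_o2a = distance(MN, O2A)
mn_o2b = distance(MN, O2B)

print(f"Mn2+ to O2A: {mn_o2a:.1f} Å")
print(f"Mn2+ to O2B: {mn_o2b:.1f} Å")
```

Both values round to 2.1 Å, in agreement with the 2.1 Å distances listed for the phosphate oxygens.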
Table 8

Protein threading results. Proteins from different families were threaded against a THREADER 2 database
containing 1900 protein folds, including GnT I, SpsA, GlmU, and β4Gal-T1. The folds were sorted on the
basis of their filtered combined energy Z-scores. When a GTCD-1-containing fold was one of the top thirty
hits, out of 1900, then the top thirty hits were rerun with a randomization test of fifty shuffled-sequence
threadings for each fold, to give a combined energy shuffled Z-score. A correct prediction should score well
in both tests.

Family | Class | Protein (GenBank GI number) | Top GTCD-1-containing Hit | Z-score (rank) | Randomization Test Z-score (rank)
1 | Inverting | Petunia x hybrida UDP-rhamnose anthocyanidin-3-glucoside rhamnosyltransferase (397567) | β4Gal-T1 | 2.33 (10) | 3.28 (8)
2 | Inverting | H. influenzae lgtD (1074167) | SpsA | 2.02 (2) | 4.59 (1)
3 | Retaining | S. cerevisiae Glycogen [Starch] Synthase Isoform 1 (136753) | β4Gal-T1 | 2.90 (2) | 4.47 (1)
4 | Retaining | Salmonella typhimurium Lipopolysaccharide 1,2-N-acetylglucosaminetransferase rfaK (132488) | GlmU | 2.63 (3) | 3.81 (4)
  | | Shigella dysenteriae galactosyltransferase RfpB (688322) | GlmU | 2.61 (5) | 0.52 (25)
5 | Retaining | Triticum aestivum Granule-bound starch synthase (136765) | GlmU | 2.41 (8) | 0.72 (14)
6 | Retaining | Homo sapiens histo-blood group A transferase (340077) | GlmU | 3.09 (1) | 2.28 ([illegible])
  | | Synthetic blood group B alpha-1,3-galactosyltransferase (1041670) | GnT I / SpsA | 3.12 (1) / 2.63 (5) | 3.47 (3) / 4.95 (1)
7 | Inverting | Lymnaea stagnalis β-1,4-GlcNAc transferase | β4Gal-T1 | 14.98 (1) | 11.75 (1)
8 | Retaining | Oryctolagus cuniculus Glycogenin-1 (417075) | GnT I | 2.48 (5) | 3.49 (1)
9 | Inverting | Bordetella pertussis rfaC (992970) | GlmU | 2.59 (6) | 1.31 (10)
10 | Inverting | Homo sapiens Fucosyltransferase 5 (1730135) | GlmU | 2.90 (1) | 1.60 (16)
11 | Inverting | Homo sapiens Fucosyltransferase 1 (120636) | GlmU | 3.46 (1) | 1.73 (14)
12 | Inverting | Homo sapiens GM2/GD2 synthase (1168736) | GlmU | 2.80 (2) | 1.24 (10)
13 | Inverting | C. elegans gly-14 (3420844) | GnT I | 20.26 (1) | 12.34 (1)
14 | | Homo sapiens Core 2 GlcNAc-transferase (544360) | SpsA | 3.13 (4) | 5.05 (1)
15 | Retaining | Candida albicans putative mannosyltransferase Mnt1 (1480086) | SpsA | 2.37 (13) | 1.74 (10)
16 | Inverting | Homo sapiens GnT II (1708004) | SpsA | 2.84 (2) | 4.53 (1)
17 | Inverting | Homo sapiens GnT III (1169979) | GlmU | 2.85 (2) | 0.66 (15)
18 | Inverting | Homo sapiens GnT V (1169980) | GlmU / GnT I | 2.82 (2) / 2.52 (7) | 2.33 (6) / 2.41 (4)
19 | [illegible] | E. coli lipid A disaccharide synthase (126464) | GlmU | 2.72 (5) | 0.86 (15)
20 | Retaining | A. thaliana trehalose-6-phosphate synthase (1865676) | GlmU | 2.94 (3) | 1.48 (9)
21 | Retaining | Homo sapiens ceramide glucosyltransferase (2498228) | GlmU | 2.81 (1) | 1.08 (9)
22 | ??? | Homo sapiens PIG-B (1552166) | GlmU | 2.64 (3) | 0.14 (27)
23 | Inverting | Sus scrofa N-acetyl-β-D-glucosaminide α-1,6-fucosyltransferase (1752753) | GlmU | 2.33 (8) | 0.85 (16)
24 | Retaining | Drosophila melanogaster UDP-glucose glycoprotein glucosyltransferase (790584) | GlmU / GnT I / SpsA | 3.54 (1) / 3.05 (3) / 2.82 (7) | 1.94 (2) / 1.74 (3) / 2.03 (1)
  | | Saccharomyces cerevisiae Killer-toxin resistance protein 5 precursor (2507054) | GlmU / SpsA | 3.05 (2) / 2.99 (3) | 2.23 (5) / 5.39 (1)
25 | Inverting | Haemophilus influenzae Lipooligosaccharide biosynthesis protein lex-1 (1170778) | SpsA | 2.39 (8) | 1.51 (2)
26 | ??? | Bacillus subtilis Teichoic acid biosynthesis protein A (135271) | GlmU | 3.57 (1) | 7.42 (1)
27 | Retaining | Homo sapiens polypeptide N-acetylgalactosaminyltransferase (1709558) | GlmU / β4Gal-T1 / GnT I / SpsA | 3.06 (2) / 2.96 (3) / 2.94 (4) / 2.48 (12) | 1.75 (14) / 3.19 (4) / 4.34 (2) / 2.38 (10)
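
The randomization test described in the Table 8 caption can be sketched as follows. This is a minimal Python illustration, not THREADER 2 itself: the scoring function `toy_energy` is a hypothetical stand-in for a threading energy, and the sign convention (a sequence scoring lower in energy than its shuffled versions gives a positive Z-score) is an assumption. Only the count of fifty shuffled sequences comes from the caption.

```python
import random
import statistics


def shuffled_z_score(sequence, energy_fn, n_shuffles=50, seed=0):
    """Z-score of a sequence's energy against shuffled-sequence energies.

    Assumed convention: lower energy is better, so the score is
    (mean of shuffled energies - native energy) / standard deviation.
    """
    rng = random.Random(seed)
    native_energy = energy_fn(sequence)
    shuffled_energies = []
    for _ in range(n_shuffles):
        residues = list(sequence)
        rng.shuffle(residues)  # same composition, randomized order
        shuffled_energies.append(energy_fn("".join(residues)))
    mean = statistics.mean(shuffled_energies)
    sd = statistics.stdev(shuffled_energies)
    return (mean - native_energy) / sd


def toy_energy(sequence):
    """Hypothetical energy: rewards adjacent identical residues."""
    return -float(sum(a == b for a, b in zip(sequence, sequence[1:])))


# SEQ ID NO. 1 from the sequence listing, used only as sample input.
seq = "VWEDDLEVAPDFFEYFQATYPLLKADSL"
z = shuffled_z_score(seq, toy_energy)
```

Shuffling preserves amino acid composition, so the score reflects how much the residue order, rather than the composition alone, contributes to the fit.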
WE CLAIM:

1. A secondary or three-dimensional structure of a purified glycosyltransferase when it associates
with a nucleotide sugar donor, acceptor, or metal cofactor.

2. A secondary or three-dimensional structure of a purified glycosyltransferase in association with a 
moiety. 

3. A secondary or three-dimensional structure as claimed in claim 2, wherein the moiety is a 
nucleotide sugar donor, acceptor, metal cofactor, or heavy metal atom. 

4. A secondary or three-dimensional structure of a glycosyltransferase as defined in any of the 
preceding claims that is a crystalline form. 

5. A secondary or three-dimensional structure of a glycosyltransferase as defined in any of the 
preceding claims, wherein the glycosyltransferase is an N-acetylglucosaminyltransferase. 

6. A secondary or three-dimensional structure of a glycosyltransferase as defined in any of the 
preceding claims having one or both of the following characteristics: 

(a) an N-terminal domain comprising an eight-stranded mixed β-sheet flanked by six helices,
and a small two-stranded antiparallel β-sheet; and

(b) a C-terminal domain comprising a four-stranded mixed β-sheet flanked by three α-helices
and a short β-finger.

7. A secondary or three-dimensional structure of a glycosyltransferase as defined in claim 6 further
characterized by the N-terminal domain and C-terminal domain being connected by a linker
region which wraps halfway around the N-terminal domain before starting the first helix of the
C-terminal domain.

8. A secondary or three-dimensional structure of a glycosyltransferase as defined in any of the 
preceding claims having the structural coordinates of a glycosyltransferase listed in Table 1, 2, 
3, or 4. 

9. A secondary or three-dimensional structure of a glycosyltransferase in association with a sugar 
nucleotide donor having the structural coordinates of a glycosyltransferase and a sugar 
nucleotide donor listed in Table 3. 

10. A secondary or three-dimensional structure of a glycosyltransferase in association with an
acceptor having the structural coordinates of a glycosyltransferase and an acceptor listed in Table
4.

11. A crystalline form of a glycosyltransferase having a unit cell with dimensions of a = 40.4 ± 3 Å,
b = 82.4 ± 3 Å, and c = 102.5 ± 3 Å.

12. A crystalline form of an N-acetylglucosaminyltransferase having the structural coordinates listed
in Table 1, 2, 3, or 4, and a unit cell with dimensions of a = 40.4 ± 3 Å, b = 82.4 ± 3 Å, and c =
102.5 ± 3 Å.

13. A crystalline form as claimed in claim 11 or 12 further characterized by the parameters, 
diffraction statistics, and/or refinement statistics in Table 6. 
14. A secondary or three-dimensional structure of a binding site of a secondary or three-dimensional
structure of a glycosyltransferase as defined in any of the preceding claims.

15. A secondary or three-dimensional structure of a binding site as claimed in claim 14 wherein the
binding site is defined by its association with one or more of a diphosphate group of a sugar
nucleotide donor, a nucleotide of a sugar nucleotide donor, a sugar of a nucleotide of a sugar
nucleotide donor, a selected sugar of a sugar nucleotide donor that is transferred to an acceptor,
and/or an acceptor.

16. A secondary or three-dimensional structure of a binding site of a glycosyltransferase as defined
in the preceding claims wherein the binding site is also defined by the atomic interactions of
Table 5, preferably the enzyme atomic contacts.

17. A secondary or three-dimensional structure of a binding site of a glycosyltransferase as defined
in the preceding claims wherein the binding site is defined by atomic interactions 1 to 5; 6 and 7;
8, 9 and 10; 1 to 13; 14 to 21; 22 to 27; 1 to 13; 1 to 21; or 11, 12, 13, and 27 listed in Table 5, or
the enzyme atomic contacts for these atomic interactions listed in Table 5.

18. A secondary or three-dimensional structure of an SpsA GnT I core (SGC) domain of a secondary
or three-dimensional structure of a glycosyltransferase as defined in any of the preceding claims.

19. A secondary or three-dimensional structure of an SGC domain as claimed in claim 18
characterized by an eight-stranded mixed β-sheet, flanked by six helices, and a small two-stranded
antiparallel β-sheet.

20. A modulator of the activity of a glycosyltransferase derived from a secondary or three-
dimensional structure as claimed in any of the preceding claims.

21. A method of determining three-dimensional structures of polypeptides with unknown structure 
comprising the step of applying the structural coordinates of Table 1, 2, 3, or 4. 

22. A method for identifying a potential modulator of a glycosyltransferase, or binding sites or 
domains thereof, comprising the step of using the structural coordinates of Table 1, 2, 3, or 4 that 
define a glycosyltransferase or binding sites or domains thereof, to computationally evaluate a 
test compound for its ability to associate with the glycosyltransferase, binding sites or domains 
thereof wherein a test compound that associates is a potential modulator of a 
glycosyltransferase. 

23. A method for identifying a modulator of a glycosyltransferase by determining binding 
interactions between a test compound and secondary or three-dimensional structures of binding 
sites as defined in any of the preceding claims comprising: 

(a) generating the binding sites on a computer screen; 

(b) generating a test compound with its spatial structure on the computer screen; and 

(c) testing to determine whether the test compound binds to a selected number of binding
sites.

24. A method for identifying a potential modulator of a glycosyltransferase function comprising the
steps:

(a) docking a computer representation of a compound from a computer data base with a
computer representation of a secondary or three-dimensional structure of a
glycosyltransferase or a binding site as defined in any of the preceding claims, to obtain
a complex;

(b) determining a conformation of the complex with a favourable geometric fit and
favourable complementary interactions; and

(c) identifying compounds that best fit the selected site as potential modulators of the
glycosyltransferase.

25. A method for identifying a potential modulator of a glycosyltransferase function comprising the
steps:

(a) modifying a computer representation of a compound complexed with a secondary or three-
dimensional structure of a glycosyltransferase or a binding site as defined in any of the
preceding claims, by deleting or adding a chemical group or groups;

(b) determining a conformation of the complex with a favourable geometric fit and favourable
complementary interactions; and

(c) identifying a compound that best fits the binding cavity as a potential modulator of a
glycosyltransferase.

26. A method for identifying a potential modulator of a glycosyltransferase function comprising the
steps:

(a) selecting a computer representation of a compound complexed with a secondary or three-
dimensional structure of a glycosyltransferase or a binding site as defined in any of the
preceding claims; and

(b) searching for molecules in a data base that are similar to the compound using a searching
computer program, or replacing portions of the compound with similar chemical structures
from a data base using a compound building computer program.

27. A modulator of a glycosyltransferase identified by a method as claimed in any of the preceding
claims.

28. A method for designing potential inhibitors of a glycosyltransferase comprising the step of using
the structural coordinates of a sugar nucleotide donor or acceptor or component thereof, defined
in relation to its spatial association with the three-dimensional structure of a glycosyltransferase or
a binding site as defined in any of the preceding claims, to generate a compound that is capable
of associating with the glycosyltransferase or binding cavity thereof.

29. A modulator of a glycosyltransferase based on a three-dimensional structure of a sugar
nucleotide donor, an acceptor, or a component thereof, defined in relation to the sugar nucleotide
donor's or acceptor's spatial association with a secondary or three-dimensional structure of a
glycosyltransferase or binding site as defined in the preceding claims.

30. A pharmaceutical composition comprising a modulator as claimed in any of the preceding claims
either alone or with other active substances.
31. A method of treating a disease associated with a glycosyltransferase with inappropriate activity
in a cellular organism, comprising:

(a) administering a pharmaceutical composition as claimed in claim 30; and

(b) activating or inhibiting a glycosyltransferase to treat the disease.

32. Use of a modulator identified by the methods of any of the preceding claims in the preparation of
a medicament to treat a disease associated with a glycosyltransferase with inappropriate activity
in a cellular organism.

33. Use of structural coordinates of a glycosyltransferase structure as set out in Table 1, 2, 3, or 4 to
manufacture a medicament.

34. Machine readable media encoded with data representing the structural coordinates of a secondary
or three-dimensional structure of a glycosyltransferase or a binding site as defined in any of the
preceding claims.

35. A machine readable media as claimed in claim 34 wherein the data also includes structural
coordinates for a nucleotide sugar donor, acceptor, metal cofactor, or heavy metal atom.



WO 00/78936    PCT/CA00/00725

Figure 4
Figure 5
Figure 8A
Figure 8C
Figure 8D
Figure 8E
Figure 8F
Figure 9A
Figure 9B
Figure 10A
Figure 10B
Figure 11A
Figure 11B
Figure
Figure 12
Figure 15
Figure 18
Figure 19
Figure 20
Figure 21
Figure 22
Figure 23
Figure 29
Figure 33C
Figure 34
Sequence Listing
SEQ ID NO. 1

VWEDDLEVAPDFFEYFQATYPLLKADSL
SEQ ID NO. 2

VWEDDLEVAPDFFEYFRATYPLLKADPSL
SEQ ID NO. 3

VWEDDLEVAPDFFEYFQATYPLLRTDPSL
SEQ ID NO. 4

IITEDDLDIAPDFFSYFSNTRYLLEKDPSL
SEQ ID NO. 5

IVTEDDLDIGNDFFSYFRWGKQVLNSDDTl
SEQ ID NO. 6

RHYRWALGQIFHNFNYPAAVWEDDLEVAPDFKAFWDDWMRRPEQRKGRACVRPEI
SEQ ID NO. 7

TRYAALINQAIEMAEGEYITYATDDNIYMPDRYRIGDARFFWRVNHFYPFYPLDE
SEQ ID NO. 8

KLLNVGFKEALKDYDYNCFVFSDVDLIPMNDHWGGEDDDIYNRLAFRGMSVSRPNA
SEQ ID NO. 9

LGTGHAMQQAAPFFADDEDILMLYGDVPLISVETGEYYITDIIALAYQEGREIVAVHP
FURTHER INFORMATION CONTINUED FROM PCT/ISA/210

Continuation of Box I.1

Although claim 31 is directed to a method of treatment of the
human/animal body, the search has been carried out and based on the
alleged effects of the compound/composition.

Although claims 34 and 35 could be considered as a mere presentation of
information, Rule 39.1 (v and vi) PCT, the search has been carried out as
far as possible in our systematic documentation.

Continuation of Box I.2

Present claims 20, 27, 29-32 relate to compounds defined by reference to a
desirable characteristic or property, namely modulating
glycosyltransferases.

The claims cover all compounds having this characteristic or property,
whereas the application provides support within the meaning of Article 6
PCT and/or disclosure within the meaning of Article 5 PCT for only
UDP-GlcNAc. In the present case, the claims so lack support, and the
application so lacks disclosure, that a meaningful search over the whole
of the claimed scope is impossible. Independent of the above reasoning,
the claims also lack clarity (Article 6 PCT). An attempt is made to
define the compound by reference to a result to be achieved. Again, this
lack of clarity in the present case is such as to render a meaningful
search over the whole of the claimed scope impossible. Consequently, the
search has been carried out for those parts of the claims which appear to
be clear, supported and disclosed, namely those parts relating to the
UDP-GlcNAc.

The applicant's attention is drawn to the fact that claims, or parts of
claims, relating to inventions in respect of which no international
search report has been established need not be the subject of an
international preliminary examination (Rule 66.1(e) PCT). The applicant
is advised that the EPO policy when acting as an International
Preliminary Examining Authority is normally not to carry out a
preliminary examination on matter which has not been searched. This is
the case irrespective of whether or not the claims are amended following
receipt of the search report or during any Chapter II procedure.



